Methods of Diagnosis of Colorectal Cancer, Compositions and Methods of
Screening for Colorectal Cancer Modulators

CROSS-REFERENCES TO RELATED APPLICATIONS

[01] This application is a continuation-in-part of US Patent Application
USSN 09/663,733, filed September 15, 2000, and of a US Patent Application filed August 14,
2001 (USSN not yet known), each of which is incorporated herein by reference in its entirety.

FIELD OF THE INVENTION

[02] The invention relates to the identification of expression profiles and the 
nucleic acids involved in colorectal cancer, and to the use of such expression profiles and 
nucleic acids in diagnosis and prognosis of colorectal cancer. The invention further relates to 
methods for identifying and using candidate agents and/or targets which modulate colorectal
cancer.

BACKGROUND OF THE INVENTION 
[03] Cancers of the colon and/or rectum (referred to as "colorectal cancer")
are significant in Western populations and particularly in the United States. Cancers of the
colon and rectum occur in both men and women, most commonly after the age of 50. These
cancers develop as the result of a pathologic transformation of normal colon epithelium to an
invasive cancer. A number of recently characterized genetic alterations have
been implicated in colorectal cancer, including mutations in two classes of genes,
tumor-suppressor genes and proto-oncogenes, with recent work suggesting that mutations in DNA
repair genes may also be involved in tumorigenesis. For example, inactivating mutations of
both alleles of the adenomatous polyposis coli (APC) gene, a tumor suppressor gene, appear
to be one of the earliest events in colorectal cancer, and may even be the initiating event.
Other genes implicated in colorectal cancer include the MCC gene, the p53 gene, the DCC
(deleted in colorectal carcinoma) gene and other chromosome 18q genes, and genes in the
TGF-β signaling pathway. For a review, see Molecular Biology of Colorectal Cancer, pp.

238-299, in Curr. Probl. Cancer, Sept/Oct 1997; see also Williams, Colorectal Cancer
(1996); Kinsella & Schofield, Colorectal Cancer: A Scientific Perspective (1993); Colorectal
Cancer: Molecular Mechanisms, Premalignant State and its Prevention (Schmiegel &
Scholmerich eds., 2000); Colorectal Cancer: New Aspects of Molecular Biology and Their
Clinical Applications (Hanski et al., eds. 2000); McArdle et al., Colorectal Cancer (2000);
Wanebo, Colorectal Cancer (1993); Levin, The American Cancer Society: Colorectal Cancer
(1999); Treatment of Hepatic Metastases of Colorectal Cancer (Nordlinger & Jaeck eds.,
1993); Management of Colorectal Cancer (Dunitz et al., eds. 1998); Cancer: Principles and
Practice of Oncology (Devita et al., eds. 2001); Surgical Oncology: Contemporary Principles
and Practice (Kirby et al., eds. 2001); Offit, Clinical Cancer Genetics: Risk Counseling and
Management (1997); Radioimmunotherapy of Cancer (Abrams & Fritzberg eds. 2000);
Fleming, AJCC Cancer Staging Handbook (1998); Textbook of Radiation Oncology (Leibel
& Phillips eds. 2000); and Clinical Oncology (Abeloff et al., eds. 2000).

[04] Imaging of colorectal cancer for diagnosis has been problematic and
limited. In addition, metastasis of the tumor to the lumen, and metastasis of tumor cells to
regional lymph nodes are important prognostic factors (see, e.g., PET in Oncology: Basics
and Clinical Application (Ruhlmann et al., eds. 1999)). For example, five-year survival rates
drop from 80 percent in patients with no lymph node metastases to 45 to 50 percent in those
patients who do have lymph node metastases. A recent report showed that micrometastases
can be detected in lymph nodes using reverse transcriptase-PCR methods based on the
presence of mRNA for carcinoembryonic antigen, which has previously been shown to be
present in the vast majority of colorectal cancers but not in normal tissues. Liefers et al., New
England J. of Med. 339(4):223 (1998).

[05] Thus, methods that can be used for diagnosis and prognosis of
colorectal cancer would be desirable. Accordingly, provided herein are methods that can be
used in diagnosis and prognosis of colorectal cancer. Further provided are methods that can
be used to screen candidate bioactive agents for the ability to modulate colorectal cancer.
Additionally, provided herein are molecular targets for therapeutic intervention in colorectal
and other cancers.

BRIEF SUMMARY OF THE INVENTION
[06] The present invention provides novel methods for diagnosis and
prognosis evaluation for colorectal cancer, as well as methods for screening for compositions
which modulate colorectal cancer. Methods of treatment of colorectal cancer, as well as
compositions, are also provided herein.
[07] In one aspect, a method of screening drug candidates comprises
providing a cell that expresses an expression profile gene selected from those of Table 1. The
method further includes adding a drug candidate to the cell and determining the effect of the
drug candidate on the expression of the expression profile gene.

[08] In one embodiment, the method of screening drug candidates includes
comparing the level of expression in the absence of the drug candidate to the level of
expression in the presence of the drug candidate, wherein the concentration of the drug
candidate can vary when present, and wherein the comparison can occur after addition or
removal of the drug candidate. In a preferred embodiment, the cell expresses at least two
expression profile genes. The profile genes may show an increase or decrease.

[09] Also provided herein is a method of screening for a bioactive agent 
capable of binding to a colorectal cancer modulator protein, the method comprising 
combining the colorectal cancer modulator protein and a candidate bioactive agent, and 
determining the binding of the candidate agent to the colorectal cancer modulator protein. 
Preferably the colorectal cancer modulator protein is a product encoded by a gene of Table 1
or Table 2. 

[10] Further provided herein is a method for screening for a bioactive agent
capable of modulating the activity of a colorectal cancer modulator protein. In one
embodiment, the method comprises combining the colorectal cancer modulator protein and a
candidate bioactive agent, and determining the effect of the candidate agent on the bioactivity
of the colorectal cancer modulator protein. Preferably the colorectal cancer modulator
protein is a product encoded by a gene of Table 1 or Table 2.

[11] Also provided is a method of evaluating the effect of a candidate
colorectal cancer drug comprising administering the drug to a transgenic animal expressing or
over-expressing the colorectal cancer modulator protein, or an animal lacking the colorectal
cancer modulator protein, for example as a result of a gene knockout.

[12] Additionally, provided herein is a method of evaluating the effect of a
candidate colorectal cancer drug comprising administering the drug to a patient and removing
a cell sample from the patient. The expression profile of the cell is then determined. This
method may further comprise comparing the expression profile to an expression profile of a
healthy individual. In a preferred embodiment, said expression profile includes a gene of
Table 1 or Table 2.
[13] Moreover, provided herein is a biochip comprising one or more nucleic
acid segments of Table 1 or Table 2, wherein the biochip comprises fewer than 1000 nucleic
acid probes. Preferably, at least two nucleic acid segments are included.

[14] Furthermore, a method of diagnosing a disorder associated with
colorectal cancer is provided. The method comprises determining the expression of a gene of
Table 1 or Table 2 in a first tissue type of a first individual, and comparing the distribution to
the expression of the gene from a second normal tissue type from the first individual or a
second unaffected individual. A difference in the expression indicates that the first individual
has a disorder associated with colorectal cancer.

[15] In another aspect, the present invention provides an antibody which
specifically binds to a protein encoded by a nucleic acid of Table 1 or Table 2 or a fragment
thereof. Preferably the antibody is a monoclonal antibody. The antibody can be a fragment
of an antibody such as a single stranded antibody as further described herein, or can be
conjugated to another molecule. In one embodiment, the antibody is a humanized antibody.

[16] In one embodiment, a method is provided for screening for a bioactive agent
capable of interfering with the binding of a colorectal cancer modulating protein (colorectal
cancer modulator protein) or a fragment thereof and an antibody which binds to said
colorectal cancer modulator protein or fragment thereof. In a preferred embodiment, the
method comprises combining a colorectal cancer modulator protein or fragment thereof, a
candidate bioactive agent and an antibody which binds to said colorectal cancer modulator
protein or fragment thereof. The method further includes determining the binding of said
colorectal cancer modulator protein or fragment thereof and said antibody. Where there is
a change in binding, an agent is identified as an interfering agent. The interfering agent can
be an agonist or an antagonist. Preferably, the agent inhibits colorectal cancer.

[17] In a further aspect, a method for inhibiting colorectal cancer is
provided. The method can be performed in vitro or in vivo, preferably in vivo to an
individual. In a preferred embodiment the method of inhibiting colorectal cancer is provided
to an individual with cancer. As described herein, methods of inhibiting colorectal cancer
can be performed by administering an inhibitor of the activity of a protein encoded by a
nucleic acid of Table 1 or Table 2, including an antisense molecule to the gene or its gene
product.

[18] Also provided herein are methods of eliciting an immune response in
an individual. In one embodiment a method provided herein comprises administering to an
individual a composition comprising a colorectal cancer modulating protein, or a fragment
thereof. In another embodiment, the protein is encoded by a nucleic acid selected from those
of Table 1 or Table 2. In another aspect, said composition comprises a nucleic acid
comprising a sequence encoding a colorectal cancer modulating protein, or a fragment
thereof.

[19] Further provided herein are compositions capable of eliciting an
immune response in an individual. In one embodiment, a composition provided herein
comprises a colorectal cancer modulating protein, preferably encoded by a nucleic acid of
Table 1 or Table 2, or a fragment thereof, and a pharmaceutically acceptable carrier. In
another embodiment, said composition comprises a nucleic acid comprising a sequence
encoding a colorectal cancer modulating protein, preferably selected from the nucleic acids of
Table 1 or Table 2, and a pharmaceutically acceptable carrier.

[20] Also provided are methods of neutralizing the effect of a colorectal 
cancer protein, or a fragment thereof, comprising contacting an agent specific for said protein 
with said protein in an amount sufficient to effect neutralization. In another embodiment, the
protein is encoded by a nucleic acid selected from those of Table 1 or Table 2.

[21] In another aspect of the invention, a method of treating an individual 
for colorectal cancer is provided. In one embodiment, the method comprises administering to 
said individual an inhibitor of a colorectal cancer modulating protein. In another 
embodiment, the method comprises administering to a patient having colorectal cancer an
antibody to a colorectal cancer modulating protein conjugated to a therapeutic moiety. Such
a therapeutic moiety can be a cytotoxic agent or a radioisotope. 

[22] Compounds and compositions are also provided. Other aspects of the 
invention will become apparent to the skilled artisan by the following description of the 
invention. 

BRIEF DESCRIPTION OF THE DRAWINGS

[NOT APPLICABLE] 

DETAILED DESCRIPTION OF THE INVENTION 
[23] The present invention provides novel methods for diagnosis and
prognosis evaluation for colorectal cancer, as well as methods for screening for compositions
which modulate colorectal cancer. The methods herein are related to those of U.S. Patent 
Application Serial No. 09/525,993 and International Patent Application No. 
PCT/US00/07044, each of which is incorporated herein in its entirety. 
[24] By "colorectal cancer" herein is meant a colon and/or rectal tumor or
cancer that is classified as Dukes stage A or B as well as metastatic tumors classified as
Dukes stage C or D (see, e.g., Cohen et al., Cancer of the Colon, in Cancer: Principles and
Practice of Oncology, pp. 1144-1197 (Devita et al., eds., 5th ed. 1997); see also Harrison's
Principles of Internal Medicine, pp. 1289-129 (Wilson et al., eds., 12th ed., 1991)).
"Treatment, monitoring, detection or modulation of colorectal cancer" includes treatment,
monitoring, detection, or modulation of colorectal disease in those patients who have
colorectal disease (Dukes stage A, B, C or D) in which gene expression from a gene in Table
1 or 2 is increased or decreased, indicating that the subject is more likely to progress to
metastatic disease than a patient who does not have an increase or decrease in gene
expression of a gene in Table 1 or 2. In Dukes stage A, the tumor has penetrated into, but not
through, the bowel wall. In Dukes stage B, the tumor has penetrated through the bowel wall
but there is not yet any lymph involvement. In Dukes stage C, the cancer involves regional
lymph nodes. In Dukes stage D, there is distant metastasis, e.g., liver, lung, etc.

[25] Table 1 provides Unigene cluster identification numbers for the
nucleotide sequence of genes that exhibit increased expression in colorectal cancer samples.
Table 1 also provides an exemplar accession number that provides a nucleotide sequence
that is part of the Unigene cluster. Table 2 provides the nucleic acid and protein sequence of
the CBF9 gene as well as the Unigene and exemplar accession numbers for CBF9.

[26] In one aspect, the expression levels of genes are determined in
different patient samples for which either diagnosis or prognosis information is desired, to
provide expression profiles. An expression profile of a particular sample is essentially a
"fingerprint" of the state of the sample; while two states may have any particular gene
similarly expressed, the evaluation of a number of genes simultaneously allows the
generation of a gene expression profile that is unique to the state of the cell. That is, normal
tissue may be distinguished from colorectal cancer tissue, and within colorectal cancer
tissue, different prognosis states (good or poor long term survival prospects, for example)
may be determined. By comparing expression profiles of colon tissue in known different
states, information regarding which genes are important (including both up- and
down-regulation of genes) in each of these states is obtained. The identification of sequences that
are differentially expressed in colorectal cancer versus normal colon tissue, as well as
differential expression resulting in different prognostic outcomes, allows the use of this
information in a number of ways. For example, a particular treatment
regime may be evaluated: does a chemotherapeutic drug act to improve the long-term
prognosis in a particular patient? Similarly, diagnosis may be done or confirmed by
comparing patient samples with the known expression profiles. Furthermore, these gene
expression profiles (or individual genes) allow screening of drug candidates with an eye to
mimicking or altering a particular expression profile; for example, screening can be done for
drugs that suppress the colorectal cancer expression profile or convert a poor prognosis
profile to a better prognosis profile. This may be done by making biochips comprising sets of
the important colorectal cancer genes, which can then be used in these screens. These
methods can also be done on the protein basis; that is, protein expression levels of the
colorectal cancer proteins can be evaluated for diagnostic and prognostic purposes or to
screen candidate agents. In addition, the colorectal cancer nucleic acid sequences can be
administered for gene therapy purposes, including the administration of antisense nucleic
acids, or the colorectal cancer proteins (including antibodies and other modulators thereof)
administered as therapeutic drugs.
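
The comparison of a patient sample with known expression profiles can be illustrated with a minimal Python sketch. The text does not specify a comparison method, so the use of Pearson correlation here is an assumption, as are the function names, the state labels and the numeric values, which are purely illustrative.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def closest_state(sample, reference_profiles):
    """Return the label of the reference profile most correlated with the sample."""
    return max(reference_profiles,
               key=lambda state: pearson(sample, reference_profiles[state]))

# Illustrative (made-up) expression levels for four genes, in the same order.
reference_profiles = {
    "normal": [1.0, 1.0, 5.0, 5.0],
    "tumor": [6.0, 5.0, 1.0, 1.0],
}
sample = [5.5, 4.0, 1.5, 0.8]
print(closest_state(sample, reference_profiles))
```

In practice a profile would span many more genes, and the similarity measure would be chosen to suit the measurement platform.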

[27] Thus the present invention provides nucleic acid and protein
sequences that are differentially expressed in colorectal cancer, herein termed "colorectal
cancer sequences". As outlined below, colorectal cancer sequences include those that are
up-regulated (i.e. expressed at a higher level) in colorectal cancer, as well as those that are
down-regulated (i.e. expressed at a lower level) in colorectal cancer. In a preferred
embodiment, the colorectal cancer sequences are from humans; however, as will be
appreciated by those in the art, colorectal cancer sequences from other organisms may be
useful in animal models of disease and drug evaluation; thus, other colorectal cancer
sequences are provided, from vertebrates, including mammals, including rodents (rats, mice,
hamsters, guinea pigs, etc.), primates, and farm animals (including sheep, goats, pigs, cows,
horses, etc.). Colorectal cancer sequences from other organisms may be obtained using the
techniques outlined below.

[28] Colorectal cancer sequences can include both nucleic acid and amino
acid sequences. In a preferred embodiment, the colorectal cancer sequences are recombinant
nucleic acids. By the term "recombinant nucleic acid" herein is meant nucleic acid, originally
formed in vitro, in general, by the manipulation of nucleic acid by polymerases and
endonucleases, in a form not normally found in nature. Thus an isolated nucleic acid, in a
linear form, or an expression vector formed in vitro by ligating DNA molecules that are not
normally joined, are both considered recombinant for the purposes of this invention. It is
understood that once a recombinant nucleic acid is made and reintroduced into a host cell or
organism, it will replicate non-recombinantly, i.e. using the in vivo cellular machinery of the
host cell rather than in vitro manipulations; however, such nucleic acids, once produced
recombinantly, although subsequently replicated non-recombinantly, are still considered
recombinant for the purposes of the invention.

[29] Similarly, a "recombinant protein" is a protein made using recombinant
techniques, i.e. through the expression of a recombinant nucleic acid as depicted above. A
recombinant protein is distinguished from naturally occurring protein by at least one or more
characteristics. For example, the protein may be isolated or purified away from some or all
of the proteins and compounds with which it is normally associated in its wild type host, and
thus may be substantially pure. For example, an isolated protein is unaccompanied by at least
some of the material with which it is normally associated in its natural state, preferably
constituting at least about 0.5%, more preferably at least about 5% by weight of the total
protein in a given sample. A substantially pure protein comprises at least about 75% by
weight of the total protein, with at least about 80% being preferred, and at least about 90%
being particularly preferred. The definition includes the production of a colorectal cancer
protein from one organism in a different organism or host cell. Alternatively, the protein may
be made at a significantly higher concentration than is normally seen, through the use of an
inducible promoter or high expression promoter, such that the protein is made at increased
concentration levels. Alternatively, the protein may be in a form not normally found in
nature, as in the addition of an epitope tag or amino acid substitutions, insertions and
deletions, as discussed below.

[30] In a preferred embodiment, the colorectal cancer sequences are
nucleic acids. As will be appreciated by those in the art and is more fully outlined below,
colorectal cancer sequences are useful in a variety of applications, including diagnostic
applications, which will detect naturally occurring nucleic acids, as well as screening
applications; for example, biochips comprising nucleic acid probes to the colorectal cancer
sequences can be generated. In the broadest sense, then, "nucleic acid" or
"oligonucleotide" or grammatical equivalents herein means at least two nucleotides
covalently linked together. A nucleic acid of the present invention will generally contain
phosphodiester bonds, although in some cases, as outlined below, nucleic acid analogs are
included that may have alternate backbones, comprising, for example, phosphoramidate
(Beaucage et al., Tetrahedron 49(10):1925 (1993) and references therein; Letsinger, J. Org.
Chem. 35:3800 (1970); Sprinzl et al., Eur. J. Biochem. 81:579 (1977); Letsinger et al., Nucl.
Acids Res. 14:3487 (1986); Sawai et al., Chem. Lett. 805 (1984); Letsinger et al., J. Am.
Chem. Soc. 110:4470 (1988); and Pauwels et al., Chemica Scripta 26:141 (1986)),
phosphorothioate (Mag et al., Nucleic Acids Res. 19:1437 (1991); and U.S. Patent No.
5,644,048), phosphorodithioate (Briu et al., J. Am. Chem. Soc. 111:2321 (1989)),
O-methylphosphoroamidite linkages (see Eckstein, Oligonucleotides and Analogues: A Practical
Approach, Oxford University Press), and peptide nucleic acid backbones and linkages (see
Egholm, J. Am. Chem. Soc. 114:1895 (1992); Meier et al., Chem. Int. Ed. Engl. 31:1008
(1992); Nielsen, Nature 365:566 (1993); Carlsson et al., Nature 380:207 (1996), all of which
are incorporated by reference). Other analog nucleic acids include those with positive
backbones (Dempcy et al., Proc. Natl. Acad. Sci. USA 92:6097 (1995)); non-ionic backbones
(U.S. Patent Nos. 5,386,023, 5,637,684, 5,602,240, 5,216,141 and 4,469,863; Kiedrowshi et
al., Angew. Chem. Intl. Ed. English 30:423 (1991); Letsinger et al., J. Am. Chem. Soc.
110:4470 (1988); Letsinger et al., Nucleoside & Nucleotide 13:1597 (1994); Chapters 2 and
3, ASC Symposium Series 580, "Carbohydrate Modifications in Antisense Research", Ed.
Y.S. Sanghui and P. Dan Cook; Mesmaeker et al., Bioorganic & Medicinal Chem. Lett. 4:395
(1994); Jeffs et al., J. Biomolecular NMR 34:17 (1994); Tetrahedron Lett. 37:743 (1996)) and
non-ribose backbones, including those described in U.S. Patent Nos. 5,235,033 and
5,034,506, and Chapters 6 and 7, ASC Symposium Series 580, "Carbohydrate Modifications
in Antisense Research", Ed. Y.S. Sanghui and P. Dan Cook. Nucleic acids containing one or
more carbocyclic sugars are also included within one definition of nucleic acids (see Jenkins
et al., Chem. Soc. Rev. (1995) pp169-176). Several nucleic acid analogs are described in
Rawls, C & E News June 2, 1997 page 35. All of these references are hereby expressly
incorporated by reference. These modifications of the ribose-phosphate backbone may be
done for a variety of reasons, for example to increase the stability and half-life of such
molecules in physiological environments or as probes on a biochip.

[31] As will be appreciated by those in the art, all of these nucleic acid
analogs may find use in the present invention. In addition, mixtures of naturally occurring
nucleic acids and analogs can be made; alternatively, mixtures of different nucleic acid 
analogs, and mixtures of naturally occurring nucleic acids and analogs may be made. 

[32] Particularly preferred are peptide nucleic acids (PNA), which include
peptide nucleic acid analogs. These backbones are substantially non-ionic under neutral
conditions, in contrast to the highly charged phosphodiester backbone of naturally occurring
nucleic acids. This results in two advantages. First, the PNA backbone exhibits improved
hybridization kinetics. PNAs have larger changes in the melting temperature (Tm) for
mismatched versus perfectly matched basepairs. DNA and RNA typically exhibit a 2-4°C
drop in Tm for an internal mismatch. With the non-ionic PNA backbone, the drop is closer to
7-9°C. Similarly, due to their non-ionic nature, hybridization of the bases attached to these
backbones is relatively insensitive to salt concentration. In addition, PNAs are not degraded
by cellular enzymes, and thus can be more stable.

[33] The nucleic acids may be single stranded or double stranded, as
specified, or contain portions of both double stranded and single stranded sequence. As will be
appreciated by those in the art, the depiction of a single strand ("Watson") also defines the
sequence of the other strand ("Crick"); thus the sequences described herein also include the
complement of the sequence. The nucleic acid may be DNA, both genomic and cDNA, RNA
or a hybrid, where the nucleic acid contains any combination of deoxyribo- and
ribo-nucleotides, and any combination of bases, including uracil, adenine, thymine, cytosine,
guanine, inosine, xanthine, hypoxanthine, isocytosine, isoguanine, etc. As used herein, the
term "nucleoside" includes nucleotides and nucleoside and nucleotide analogs, and modified
nucleosides such as amino modified nucleosides. In addition, "nucleoside" includes
non-naturally occurring analog structures. Thus for example the individual units of a peptide
nucleic acid, each containing a base, are referred to herein as a nucleoside.
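
The way one depicted strand defines the other can be shown with a short Python sketch. It assumes standard Watson-Crick pairing for the four DNA bases only; the function name and the example sequence are hypothetical and are not taken from Table 1 or Table 2.

```python
# Standard Watson-Crick pairs for DNA; the other bases listed in the text
# (inosine, xanthine, etc.) are deliberately left out of this sketch.
DNA_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq: str) -> str:
    """Return the complementary ("Crick") strand, written 5' to 3'."""
    return "".join(DNA_PAIRS[base] for base in reversed(seq.upper()))

print(reverse_complement("ATGGCA"))  # TGCCAT
```

Applying the function twice returns the original sequence, which is why depicting a single strand is sufficient to define both.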

[34] A colorectal cancer sequence can be initially identified by substantial 
nucleic acid and/or amino acid sequence homology to the colorectal cancer sequences 
outlined herein. Such homology can be based upon the overall nucleic acid or amino acid 
sequence, and is generally determined as outlined below, using either homology programs or
hybridization conditions.

[35] The isolation of mRNA comprises isolating total cellular RNA by
disrupting a cell and performing differential centrifugation. Once the total RNA is isolated,
mRNA is isolated by making use of the adenine nucleotide residues known to those skilled in
the art as a poly(A) tail found on virtually every eukaryotic mRNA molecule at the 3' end
thereof. Oligonucleotides composed of only deoxythymidine [oligo(dT)] are linked to
cellulose and the oligo(dT)-cellulose packed into small columns. When a preparation of total
cellular RNA is passed through such a column, the mRNA molecules bind to the oligo(dT) by
the poly(A) tails while the rest of the RNA flows through the column. The bound mRNAs
are then eluted from the column and collected.

[36] The colorectal cancer sequences of the invention can be identified as
follows. Samples of normal and tumor tissue are applied to biochips comprising nucleic acid
probes. The samples are first microdissected, if applicable, and treated as described above
for the preparation of mRNA. Suitable biochips are commercially available, for example
from Affymetrix. Gene expression profiles as described herein are generated, and the data
analyzed.

[37] In a preferred embodiment, the genes showing changes in expression
as between normal and disease states are compared to genes expressed in other normal
tissues, including, but not limited to lung, heart, brain, liver, breast, kidney, muscle, prostate,
small intestine, large intestine, spleen, bone, and placenta. In a preferred embodiment, those
genes identified during the colorectal cancer screen that are expressed in any significant
amount in other tissues are removed from the profile, although in some embodiments, this is
not necessary. That is, when screening for drugs, it is preferable that the target be disease
specific, to minimize possible side effects.
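
The removal of genes that are also expressed in other normal tissues can be sketched as a simple filter. The text does not define "any significant amount", so the threshold below is an assumed parameter; the function name, gene names and expression values are likewise hypothetical.

```python
def filter_disease_specific(candidates, normal_tissue_expression, threshold):
    """Keep candidate genes whose expression stays below `threshold`
    in every other normal tissue that was measured."""
    kept = []
    for gene in candidates:
        levels = normal_tissue_expression.get(gene, {})
        if all(level < threshold for level in levels.values()):
            kept.append(gene)
    return kept

# Illustrative (made-up) expression levels in other normal tissues.
candidates = ["geneA", "geneB"]
normal_tissue_expression = {
    "geneA": {"lung": 5.0, "heart": 2.0, "liver": 8.0},
    "geneB": {"lung": 250.0, "heart": 3.0, "liver": 4.0},
}
print(filter_disease_specific(candidates, normal_tissue_expression, threshold=100.0))
```

Here geneB is dropped because of its high level in lung, while geneA is retained as a more disease-specific candidate.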

[38] In a preferred embodiment, colorectal cancer sequences are those that
are up-regulated in colorectal cancer; that is, the expression of these genes is higher in
colorectal carcinoma as compared to normal colon tissue. "Up-regulation" as used herein
means at least about a 1.1-fold change, preferably a 1.5- or two-fold change, more preferably at least
about a three-fold change, with at least about five-fold or higher being preferred. All
accession numbers herein are for the GenBank sequence database and the sequences of the
accession numbers are hereby expressly incorporated by reference. GenBank is known in the
art, see, e.g., Benson, DA, et al., Nucleic Acids Research 26:1-7 (1998) and
http://www.ncbi.nlm.nih.gov/. In addition, these genes were found to be expressed in a
limited amount or not at all in heart, brain, lung, liver, breast, kidney, prostate, small intestine
and spleen.

[39] In a preferred embodiment, colorectal cancer sequences are those that
are down-regulated in colorectal cancer; that is, the expression of these genes is lower in
colorectal carcinoma as compared to normal colon tissue. "Down-regulation" as used herein
means at least about a two-fold change, preferably at least about a three-fold change, with at
least about five-fold or higher being preferred.
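
The fold-change definitions of up- and down-regulation can be expressed as a small classifier. This is a minimal sketch: the function name and default cutoffs are assumptions, and the two-fold defaults are just one of the thresholds named in the text, which lists several levels of stringency.

```python
def classify_expression(tumor, normal, up_fold=2.0, down_fold=2.0):
    """Classify a gene by the ratio of tumor to normal expression.

    up_fold and down_fold are the minimum fold changes required; both
    expression values must be positive for the ratio to be meaningful.
    """
    if tumor <= 0 or normal <= 0:
        raise ValueError("expression levels must be positive")
    ratio = tumor / normal
    if ratio >= up_fold:
        return "up-regulated"
    if ratio <= 1.0 / down_fold:
        return "down-regulated"
    return "unchanged"

print(classify_expression(300.0, 100.0))               # up-regulated
print(classify_expression(40.0, 100.0))                # down-regulated
print(classify_expression(120.0, 100.0))               # unchanged
print(classify_expression(120.0, 100.0, up_fold=1.1))  # up-regulated
```

Raising the cutoffs to three-fold or five-fold narrows the set of genes reported, in line with the more preferred thresholds.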

[40] Colorectal cancer proteins of the present invention may be classified
as secreted proteins, transmembrane proteins or intracellular proteins. In a preferred
embodiment the colorectal cancer protein is an intracellular protein. Intracellular proteins
may be found in the cytoplasm and/or in the nucleus. Intracellular proteins are involved in all
aspects of cellular function and replication (including, for example, signaling pathways);
aberrant expression of such proteins results in unregulated or dysregulated cellular processes.
For example, many intracellular proteins have enzymatic activity such as protein kinase
activity, protein phosphatase activity, protease activity, nucleotide cyclase activity,
polymerase activity and the like. Intracellular proteins also serve as docking proteins that are
involved in organizing complexes of proteins, or targeting proteins to various subcellular
localizations, and are involved in maintaining the structural integrity of organelles.

[41] An increasingly appreciated concept in characterizing intracellular 
5 proteins is the presence in the proteins of one or more motifs for which defined functions 
have been attributed. In addition to the highly conserved sequences found in the enzymatic 
domain of proteins, highly conserved sequences have been identified in proteins that are 
involved in protein-protein interaction. For example, Src-homology-2 (SH2) domains bind 
tyrosine-phosphorylated targets in a sequence dependent manner. PTB domains, which are 

distinct from SH2 domains, also bind tyrosine phosphorylated targets. SH3 domains bind to
proline-rich targets. In addition, PH domains, tetratricopeptide repeats and WD domains, to
name only a few, have been shown to mediate protein-protein interactions. Some of these
may also be involved in binding to phospholipids or other second messengers. As will be 
appreciated by one of ordinary skill in the art, these motifs can be identified on the basis of 

primary sequence; thus, an analysis of the sequence of proteins may provide insight into both
the enzymatic potential of the molecule and the molecules with which the protein may
associate. 

[42] In a preferred embodiment, the colorectal cancer sequences are 
transmembrane proteins. Transmembrane proteins are molecules that span the phospholipid 

bilayer of a cell. They may have an intracellular domain, an extracellular domain, or both.
The intracellular domains of such proteins may have a number of functions including those 
already described for intracellular proteins. For example, the intracellular domain may have 
enzymatic activity and/or may serve as a binding site for additional proteins. Frequently the 
intracellular domain of transmembrane proteins serves both roles. For example certain 

receptor tyrosine kinases have both protein kinase activity and SH2 domains. In addition,
autophosphorylation of tyrosines on the receptor molecule itself creates binding sites for 
additional SH2 domain containing proteins. 

[43] Transmembrane proteins may contain from one to many 
transmembrane domains. For example, receptor tyrosine kinases, certain cytokine receptors, 

receptor guanylyl cyclases and receptor serine/threonine protein kinases contain a single
transmembrane domain. However, various other proteins including channels and adenylyl 
cyclases contain numerous transmembrane domains. Many important cell surface receptors 
are classified as "seven transmembrane domain" proteins, as they contain 7 membrane 
spanning regions. Important transmembrane protein receptors include, but are not limited to 

insulin receptor, insulin-like growth factor receptor, human growth hormone receptor,
glucose transporters, transferrin receptor, epidermal growth factor receptor, low density
lipoprotein receptor, leptin receptor, interleukin receptors,
e.g. IL-1 receptor, IL-2 receptor, etc.
[44] Characteristics of transmembrane domains include approximately 20

consecutive hydrophobic amino acids that may be followed by charged amino acids. 
Therefore, upon analysis of the amino acid sequence of a particular protein, the localization 
and number of transmembrane domains within the protein may be predicted. 
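
As an illustration of how the roughly 20-residue hydrophobic stretch described in [44] might be located computationally, the following minimal Python sketch slides a window along a protein sequence and flags windows with high mean hydropathy. The Kyte-Doolittle hydropathy scale, the threshold of 1.6 and the function name `predict_tm_segments` are assumptions introduced here for illustration; the specification names no particular scale or program.

```python
# Kyte-Doolittle hydropathy values (an assumed scale; the text names none).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def predict_tm_segments(protein, window=20, threshold=1.6):
    """Return (start, end) index pairs of candidate transmembrane segments.

    A window is flagged when its mean hydropathy reaches `threshold`;
    overlapping or adjacent flagged windows are merged into one segment.
    Only the 20 standard one-letter residue codes are accepted.
    """
    segments = []
    for start in range(len(protein) - window + 1):
        stretch = protein[start:start + window]
        mean = sum(KYTE_DOOLITTLE[aa] for aa in stretch) / window
        if mean >= threshold:
            end = start + window
            if segments and start <= segments[-1][1]:
                # Extend the previous segment instead of starting a new one.
                segments[-1] = (segments[-1][0], end)
            else:
                segments.append((start, end))
    return segments
```

Counting the returned segments gives a rough estimate of the number of membrane-spanning regions; dedicated prediction tools would be expected to perform better than this simple window scan.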

[45] The extracellular domains of transmembrane proteins are diverse; 

however, conserved motifs are found repeatedly among various extracellular domains.

Conserved structure and/or functions have been ascribed to different extracellular motifs. For 
example, cytokine receptors are characterized by a cluster of cysteines and a WSXWS (W= 
tryptophan, S= serine, X=any amino acid) motif. Immunoglobulin-like domains are highly 
conserved. Mucin-like domains may be involved in cell adhesion and leucine-rich repeats
participate in protein-protein interactions.
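
By way of example only, the WSXWS motif mentioned in [45] can be searched for with a regular expression. The sketch below is a hypothetical helper (the name `find_wsxws` is introduced here); it uses a lookahead so that overlapping occurrences are also reported.

```python
import re

# "." stands for X (any amino acid); the lookahead lets matches overlap.
WSXWS = re.compile(r"(?=WS.WS)")


def find_wsxws(protein):
    """Return the 0-based start position of every WSXWS occurrence."""
    return [match.start() for match in WSXWS.finditer(protein.upper())]
```

A hit indicates only that the primary sequence contains the motif; it does not by itself establish that the protein is a cytokine receptor.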

[46] Many extracellular domains are involved in binding to other 
molecules. In one aspect, extracellular domains are receptors. Factors that bind the receptor 
domain include circulating ligands, which may be peptides, proteins, or small molecules such 
as adenosine and the like. For example, growth factors such as EGF, FGF and PDGF are 

circulating growth factors that bind to their cognate receptors to initiate a variety of cellular
responses. Other factors include cytokines, mitogenic factors, neurotrophic factors and the 
like. Extracellular domains also bind to cell-associated molecules. In this respect, they 
mediate cell-cell interactions. Cell-associated ligands can be tethered to the cell for example 
via a glycosylphosphatidylinositol (GPI) anchor, or may themselves be transmembrane 

proteins. Extracellular domains also associate with the extracellular matrix and contribute to
the maintenance of the cell structure. 

[47] Colorectal cancer proteins that are transmembrane are particularly 
preferred in the present invention as they are good targets for immunotherapeutics, as are 
described herein. In addition, as outlined below, transmembrane proteins can also be useful
in imaging modalities.

[48] It will also be appreciated by those in the art that a transmembrane 
protein can be made soluble by removing transmembrane sequences, for example through 
recombinant methods. Furthermore, transmembrane proteins that have been made soluble 

can be made to be secreted through recombinant means by adding an appropriate signal
sequence. 

[49] In a preferred embodiment, the colorectal cancer proteins are secreted 
proteins, the secretion of which can be either constitutive or regulated. These proteins have a
signal peptide or signal sequence that targets the molecule to the secretory pathway. Secreted
proteins are involved in numerous physiological events; by virtue of their circulating nature, 
they serve to transmit signals to various other cell types. The secreted protein may function in 
an autocrine manner (acting on the cell that secreted the factor), a paracrine manner (acting 
on cells in close proximity to the cell that secreted the factor) or an endocrine manner (acting 

on cells at a distance). Thus secreted molecules find use in modulating or altering numerous
aspects of physiology. Colorectal cancer proteins that are secreted proteins are particularly
preferred in the present invention as they serve as good targets for diagnostic markers, for 
example for blood tests. 

[50] A colorectal cancer sequence is initially identified by substantial 

nucleic acid and/or amino acid sequence homology to the colorectal cancer sequences
outlined herein. Such homology can be based upon the overall nucleic acid or amino acid 
sequence, and is generally determined as outlined below, using either homology programs or 
hybridization conditions. 

[51] As used herein, the terms "colorectal cancer nucleic acid", "colorectal 

cancer protein" or "colorectal cancer polynucleotide" or "colorectal cancer-associated
transcript" refer to nucleic acid and polypeptide polymorphic variants, alleles, mutants, and
interspecies homologs that: (1) have a nucleotide sequence that has greater than about 60% 
nucleotide sequence identity, 65%, 70%, 75%, 80%, 85%, 90%, preferably 91%, 92%, 93%, 
94%, 95%, 96%, 97%, 98% or 99% or greater nucleotide sequence identity, preferably over a 

region of at least about 25, 50, 100, 200, 500, 1000, or more nucleotides, to a
nucleotide sequence of or associated with a unigene cluster of Table 1 or Table 2; (2) bind to
antibodies, e.g., polyclonal antibodies, raised against an immunogen comprising an amino 
acid sequence encoded by a nucleotide sequence of or associated with a unigene cluster of 
Table 1 or Table 2, and conservatively modified variants thereof; (3) specifically hybridize 

under stringent hybridization conditions to a nucleic acid sequence, or the complement
thereof, of Table 1 or Table 2 and conservatively modified variants thereof; or (4) have an
amino acid sequence that has greater than about 60% amino acid sequence identity, 65%, 
70%, 75%, 80%, 85%, 90%, preferably 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% 
or greater amino acid sequence identity, preferably over a region of at least about
25, 50, 100, 200, 500, 1000, or more amino acids, to an amino acid sequence encoded by a
nucleotide sequence of or associated with a unigene cluster of Table 1 or Table 2. A 
polynucleotide or polypeptide sequence is typically from a mammal including, but not 
limited to, primate, e.g., human; rodent, e.g., rat, mouse, hamster; cow, pig, horse, sheep, or 
other mammal. A "colorectal cancer polypeptide" and a "colorectal cancer polynucleotide"
include both naturally occurring and recombinant forms.

[52] Homology in this context means sequence similarity or identity, with 
identity being preferred. A preferred comparison for homology purposes is to compare the 
sequence containing sequencing errors to the correct sequence. This homology will be 

determined using standard techniques known in the art, including, but not limited to, the local
homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981), by the
homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970), by
the search for similarity method of Pearson & Lipman, PNAS USA 85:2444 (1988), by 
computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA 

in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Drive,
Madison, WI), the Best Fit sequence program described by Devereux et al., Nucl. Acid Res. 
12:387-395 (1984), preferably using the default settings, or by inspection. 

[53] In a preferred embodiment, the sequences which are used to determine 
sequence identity or similarity are selected from the sequences set forth in Table 1 or Table 2. 

In one embodiment the sequences utilized herein are those set forth in Table 1 or Table 2. In
another embodiment, the sequences are naturally occurring allelic variants of the sequences 
set forth in Table 1 or Table 2. In another embodiment, the sequences are sequence variants 
as further described herein. 

[54] The terms "identical" or "percent identity," in the context of two or
more nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences
that are the same or have a specified percentage of amino acid residues or nucleotides that are 
the same (i.e., about 60% identity, preferably 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 
94%, 95%, 96%, 97%, 98%, 99%, or higher identity over a specified region, when compared 
and aligned for maximum correspondence over a comparison window or designated region) 

as measured using the BLAST or BLAST 2.0 sequence comparison algorithms with default
parameters described below, or by manual alignment and visual inspection (see, e.g., NCBI
web site http://www.ncbi.nlm.nih.gov/BLAST/ or the like). Such sequences are then said to 
be "substantially identical." This definition also refers to, or may be applied to, the 
complement of a test sequence. The definition also includes sequences that have deletions
and/or additions, as well as those that have substitutions, as well as naturally occurring, e.g.,
polymorphic or allelic variants, and man-made variants. As described below, the preferred 
algorithms can account for gaps and the like. Preferably, identity exists over a region that is 
at least about 25 amino acids or nucleotides in length, or more preferably over a region that is 
50-100 amino acids or nucleotides in length.

[55] For sequence comparison, typically one sequence acts as a reference 
sequence, to which test sequences are compared. When using a sequence comparison 
algorithm, test and reference sequences are entered into a computer, subsequence coordinates 
are designated, if necessary, and sequence algorithm program parameters are designated. 
Preferably, default program parameters can be used, or alternative parameters can be
designated. The sequence comparison algorithm then calculates the percent sequence 
identities for the test sequences relative to the reference sequence, based on the program 
parameters. 
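
The percent identity calculation of [54] and [55] can be sketched for two sequences that have already been aligned. In this illustrative Python example, the function names are hypothetical, and counting every aligned column (including columns with a gap in one sequence) in the denominator is an assumed convention; alignment programs differ on this point. The default thresholds of 60% identity and a 25-position region are taken from the text above.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a, aligned_b, min_percent=60.0, min_region=25):
    """True if the aligned region is long enough and identical enough."""
    region = sum(1 for a, b in zip(aligned_a, aligned_b)
                 if not (a == "-" and b == "-"))
    return (region >= min_region
            and percent_identity(aligned_a, aligned_b) >= min_percent)
```

For instance, the aligned pair `ACGT-A` / `ACCTTA` shares four of six columns, or about 66.7% identity, but is too short to satisfy the 25-position region.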

[56] A "comparison window", as used herein, includes reference to a 

segment of any one of the number of contiguous positions selected from the group consisting
typically of from 20 to 600, usually about 50 to about 200, more usually about 100 to about 
150 in which a sequence may be compared to a reference sequence of the same number of 
contiguous positions after the two sequences are optimally aligned. Methods of alignment of 
sequences for comparison are well-known in the art. Optimal alignment of sequences for 

comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman,
Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman &
Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson &
Lipman, Proc. Nat'l. Acad. Sci. USA 85:2444 (1988), by computerized implementations of
these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics

Software Package, Genetics Computer Group, 575 Science Dr., Madison, WI), or by manual
alignment and visual inspection (see, e.g., Current Protocols in Molecular Biology (Ausubel
et al., eds. 1995 supplement)).
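
To make the local homology approach referenced in [56] concrete, the following is a minimal Python sketch of Smith-Waterman local alignment scoring. The scoring values (match 2, mismatch -1, linear gap -2) and the function name are assumptions chosen for illustration; they are not the defaults of GAP, BESTFIT or any other named program, and the sketch returns only the best score rather than the alignment itself.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    # h[i][j] holds the best score of a local alignment ending at a[i-1], b[j-1].
    h = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = h[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero, which is what makes the alignment local.
            h[i][j] = max(0, diagonal, h[i - 1][j] + gap, h[i][j - 1] + gap)
            best = max(best, h[i][j])
    return best
```

Removing the floor of zero and scoring the whole of both sequences would give the global (Needleman-Wunsch style) variant.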

[57] Preferred examples of algorithms that are suitable for determining 
percent sequence identity and sequence similarity include the BLAST and BLAST 2.0 

algorithms, which are described in Altschul et al., Nuc. Acids Res. 25:3389-3402 (1997) and
Altschul et al., J. Mol. Biol. 215:403-410 (1990). BLAST and BLAST 2.0 are used, with the
parameters described herein, to determine percent sequence identity for the nucleic acids and 
proteins of the invention. Software for performing BLAST analyses is publicly available
through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/).
This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying
short words of length W in the query sequence, which either match or satisfy some positive- 
valued threshold score T when aligned with a word of the same length in a database 
sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra).
These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs
containing them. The word hits are extended in both directions along each sequence for as 
far as the cumulative alignment score can be increased. Cumulative scores are calculated 
using, e.g., for nucleotide sequences, the parameters M (reward score for a pair of matching 
residues; always > 0) and N (penalty score for mismatching residues; always < 0). For amino 

acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the
word hits in each direction is halted when: the cumulative alignment score falls off by the
quantity X from its maximum achieved value; the cumulative score goes to zero or below, 
due to the accumulation of one or more negative-scoring residue alignments; or the end of 
either sequence is reached. The BLAST algorithm parameters W, T, and X determine the 

sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences)
uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4 and a
comparison of both strands. For amino acid sequences, the BLASTP program uses as
defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix
(see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1992)), alignments (B) of
50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
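
The seed-and-extend procedure described in [57] can be illustrated with a simplified Python sketch for nucleotide sequences. It is not the BLAST implementation: it seeds only on exact words (ignoring the neighborhood threshold T), extends without gaps, searches one strand, and uses an assumed drop-off X of 20, a value the text does not specify. W=11, M=5 and N=-4 follow the BLASTN defaults quoted above; the function names are hypothetical.

```python
def _extend(query, subject, qi, si, step, start_score, reward, penalty, x_drop):
    """Extend one way from (qi, si); return (score gained, residues kept)."""
    score = best = start_score
    length = best_len = 0
    while 0 <= qi < len(query) and 0 <= si < len(subject):
        score += reward if query[qi] == subject[si] else penalty
        length += 1
        if score > best:
            best, best_len = score, length
        elif best - score >= x_drop or score <= 0:
            break  # fell X below the maximum, or dropped to zero or below
        qi += step
        si += step
    return best - start_score, best_len


def find_hsps(query, subject, word=11, reward=5, penalty=-4, x_drop=20):
    """Return sorted (score, query_start, subject_start, length) tuples."""
    index = {}
    for s in range(len(subject) - word + 1):
        index.setdefault(subject[s:s + word], []).append(s)
    hsps = set()
    for q in range(len(query) - word + 1):
        for s in index.get(query[q:q + word], []):
            seed = word * reward
            right, right_len = _extend(query, subject, q + word, s + word, 1,
                                       seed, reward, penalty, x_drop)
            left, left_len = _extend(query, subject, q - 1, s - 1, -1,
                                     seed, reward, penalty, x_drop)
            hsps.add((seed + left + right, q - left_len, s - left_len,
                      word + left_len + right_len))
    return sorted(hsps, reverse=True)
```

Raising `word` or lowering `x_drop` makes the search faster but less sensitive, which is the trade-off among W, T and X noted in the text.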

[58] The BLAST algorithm also performs a statistical analysis of the 
similarity between two sequences (see, e.g., Karlin & Altschul, Proc. Natl. Acad. Sci. USA
90:5873-5877 (1993)). One measure of similarity provided by the BLAST algorithm is the
smallest sum probability (P(N)), which provides an indication of the probability by which a 

match between two nucleotide or amino acid sequences would occur by chance. For
example, a nucleic acid is considered similar to a reference sequence if the smallest sum 
probability in a comparison of the test nucleic acid to the reference nucleic acid is less than 
about 0.2, more preferably less than about 0.01, and most preferably less than about 0.001. 
Log values may be large negative numbers, e.g., 5, 10, 20, 30, 40, 40, 70, 90, 110, 150, 170,
etc.

[59] In one embodiment, the nucleic acid homology is determined through 
hybridization studies. Thus, for example, nucleic acids which hybridize under high 
stringency to the nucleic acid sequences which encode the peptides identified in Table 1 or 
Table 2, or their complements, are considered colorectal cancer sequences. High stringency
conditions are known in the art; see for example Maniatis et al., Molecular Cloning: A
Laboratory Manual, 2d Edition, 1989, and Short Protocols in Molecular Biology, ed. 
Ausubel, et al., both of which are hereby incorporated by reference. Stringent conditions are 
sequence-dependent and will be different in different circumstances. Longer sequences 
hybridize specifically at higher temperatures. An extensive guide to the hybridization of
nucleic acids is found in Tijssen, Techniques in Biochemistry and Molecular Biology- 
Hybridization with Nucleic Acid Probes, "Overview of principles of hybridization and the 
strategy of nucleic acid assays" (1993). Generally, stringent conditions are selected to be 
about 5-10°C lower than the thermal melting point (Tm) for the specific sequence at a 

defined ionic strength and pH. The Tm is the temperature (under defined ionic strength, pH and
nucleic acid concentration) at which 50% of the probes complementary to the target hybridize 
to the target sequence at equilibrium (as the target sequences are present in excess, at Tm, 
50% of the probes are occupied at equilibrium). Stringent conditions will be those in which 
the salt concentration is less than about 1.0 M sodium ion, typically about 0.01 to 1.0 M 

sodium ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about
30°C for short probes (e.g. 10 to 50 nucleotides) and at least about 60°C for long probes (e.g. 
greater than 50 nucleotides). Stringent conditions may also be achieved with the addition of 
destabilizing agents such as formamide. 
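
As a rough numerical illustration of the rule in [59] that stringent conditions lie about 5-10°C below the Tm, the Python sketch below estimates a Tm and derives that temperature window. The Tm estimate uses the Wallace rule (2°C per A or T, 4°C per G or C), which is an assumption introduced here; it is a crude approximation suited only to short oligonucleotides and does not account for the ionic strength, pH or formamide effects discussed in the text.

```python
def wallace_tm(oligo):
    """Approximate Tm in deg C by the Wallace rule (an assumed estimator)."""
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2.0 * at + 4.0 * gc


def stringent_window(tm):
    """Return (low, high) hybridization temperatures 10 to 5 deg C below Tm."""
    return (tm - 10.0, tm - 5.0)
```

For a 16-mer with eight A/T and eight G/C bases the estimate is 48°C, giving a window of 38°C to 43°C.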

[60] In another embodiment, less stringent hybridization conditions are 

used; for example, moderate or low stringency conditions may be used, as are known in the
art; see Maniatis and Ausubel, supra, and Tijssen, supra. For selective or specific 
hybridization, a positive signal is at least two times background, preferably 10 times 
background hybridization. Exemplary stringent hybridization conditions can be as follows:
50% formamide, 5x SSC, and 1% SDS, incubating at 42°C, or, 5x SSC, 1% SDS, incubating
at 65°C, with wash in 0.2x SSC, and 0.1% SDS at 65°C.

[61] Nucleic acids that do not hybridize to each other under stringent 
conditions are still substantially identical if the polypeptides which they encode are 
substantially identical. This occurs, for example, when a copy of a nucleic acid is created 
using the maximum codon degeneracy permitted by the genetic code. In such cases, the 

nucleic acids typically hybridize under moderately stringent hybridization conditions.
Exemplary "moderately stringent hybridization conditions" include a hybridization in a
buffer of 40% formamide, 1 M NaCl, 1% SDS at 37°C, and a wash in 1X SSC at 45°C. A
positive hybridization is at least twice background. Those of ordinary skill will readily
recognize that alternative hybridization and wash conditions can be utilized to provide
conditions of similar stringency. Additional guidelines for determining hybridization
parameters are provided in numerous references, e.g., Current Protocols in Molecular
Biology, ed. Ausubel, et al.
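
The codon degeneracy point in [61] can be demonstrated with a short Python sketch: two nucleic acids that differ at most positions can still encode the same polypeptide. The example sequences and the function name `translate` are hypothetical and introduced only for illustration; the table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA coding sequence (length a multiple of 3) to protein."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))
```

Here `CTGAGCAGA` and `TTATCACGT` share only two of nine nucleotides, yet both translate to Leu-Ser-Arg (`LSR`), so the encoded polypeptides are identical even though the nucleic acids might not hybridize under stringent conditions.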
[62] For PCR, a temperature of about 36°C is typical for low stringency

amplification, although annealing temperatures may vary between about 32°C and 48°C 
depending on primer length. For high stringency PCR amplification, a temperature of about 
62°C is typical, although high stringency annealing temperatures can range from about 50°C 
to about 65°C, depending on the primer length and specificity. Typical cycle conditions for 
both high and low stringency amplifications include a denaturation phase of 90°C - 95°C for
30 sec - 2 min., an annealing phase lasting 30 sec. - 2 min., and an extension phase of about 
72°C for 1 - 2 min. Protocols and guidelines for low and high stringency amplification 
reactions are provided, e.g., in Innis et al., PCR Protocols, A Guide to Methods and
Applications (1990). 
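
The cycle conditions of [62] can be captured as a small configuration check. In this Python sketch the class and function names are hypothetical, and the 2°C tolerance used to interpret "about 72°C" is an assumption; the temperature and time ranges are those given in the paragraph above.

```python
from dataclasses import dataclass

# Annealing ranges (deg C) quoted in the text for each stringency level.
ANNEAL_RANGE_C = {"low": (32.0, 48.0), "high": (50.0, 65.0)}


@dataclass(frozen=True)
class PcrCycle:
    denature_c: float
    denature_s: float
    anneal_c: float
    anneal_s: float
    extend_c: float
    extend_s: float


def check_cycle(cycle, stringency, extend_tolerance_c=2.0):
    """Return the names of fields that fall outside the ranges in the text.

    extend_tolerance_c is an assumed reading of "about 72 deg C".
    """
    low, high = ANNEAL_RANGE_C[stringency]
    checks = {
        "denature_c": 90.0 <= cycle.denature_c <= 95.0,
        "denature_s": 30.0 <= cycle.denature_s <= 120.0,
        "anneal_c": low <= cycle.anneal_c <= high,
        "anneal_s": 30.0 <= cycle.anneal_s <= 120.0,
        "extend_c": abs(cycle.extend_c - 72.0) <= extend_tolerance_c,
        "extend_s": 60.0 <= cycle.extend_s <= 120.0,
    }
    return [name for name, ok in checks.items() if not ok]
```

A cycle annealing at 62°C passes the high stringency check but is reported as out of range for `anneal_c` under the low stringency setting.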

[63] In addition, the colorectal cancer nucleic acid sequences of the

invention are fragments of larger genes, i.e. they are nucleic acid segments. "Genes" in this 
context includes coding regions, non-coding regions, and mixtures of coding and non-coding 
regions. Accordingly, as will be appreciated by those in the art, using the sequences provided 
herein, additional sequences of the colorectal cancer genes can be obtained, using techniques 

well known in the art for cloning either longer sequences or the full length sequences; see
Maniatis et al., and Ausubel, et al., supra, hereby expressly incorporated by reference. 

[64] An indication that two nucleic acid sequences or polypeptides are 
substantially identical is that the polypeptide encoded by the first nucleic acid is 
immunologically cross reactive with the antibodies raised against the polypeptide encoded by 

the second nucleic acid. Thus, a polypeptide is typically substantially identical to a second
polypeptide, e.g., where the two peptides differ only by conservative substitutions. Another 
indication that two nucleic acid sequences are substantially identical is that the two molecules 
or their complements hybridize to each other under stringent conditions, as described above. 
Yet another indication that two nucleic acid sequences are substantially identical is that the 

same primers can be used to amplify the sequences.

[65] Once the colorectal cancer nucleic acid is identified, it can be cloned 
and, if necessary, its constituent parts recombined to form the entire colorectal cancer nucleic 
acid. Once isolated from its natural source, e.g., contained within a plasmid or other vector
or excised therefrom as a linear nucleic acid segment, the recombinant colorectal cancer
nucleic acid can be further used as a probe to identify and isolate other colorectal cancer
nucleic acids, for example additional coding regions. It can also be used as a "precursor" 
nucleic acid to make modified or variant colorectal cancer nucleic acids and proteins. 
[66] The colorectal cancer nucleic acids of the present invention are used in

several ways. In a first embodiment, nucleic acid probes to the colorectal cancer nucleic 
acids are made and attached to biochips to be used in screening and diagnostic methods, as 
outlined below, or for administration, for example for gene therapy and/or antisense 
applications. Alternatively, the colorectal cancer nucleic acids that include coding regions of 

colorectal cancer proteins can be put into expression vectors for the expression of colorectal
cancer proteins, again either for screening purposes or for administration to a patient. 

[67] In a preferred embodiment, nucleic acid probes to colorectal cancer 
nucleic acids (both the nucleic acid sequences encoding peptides outlined in Table 1 or
Table 2 and/or the complements thereof) are made. The nucleic acid probes attached to the 

biochip are designed to be substantially complementary to the colorectal cancer nucleic
acids, i.e. the target sequence (either the target sequence of the sample or to other probe
sequences, for example in sandwich assays), such that hybridization of the target sequence 
and the probes of the present invention occurs. As outlined below, this complementarity need 
not be perfect; there may be any number of base pair mismatches which will interfere with 

hybridization between the target sequence and the single stranded nucleic acids of the present
invention. However, if the number of mutations is so great that no hybridization can occur 
under even the least stringent of hybridization conditions, the sequence is not a 
complementary target sequence. Thus, by "substantially complementary" herein is meant 
that the probes are sufficiently complementary to the target sequences to hybridize under 

normal reaction conditions, particularly high stringency conditions, as outlined herein.
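
The notion of a probe that is complementary, but not necessarily perfectly so, as set out in [67], can be sketched in Python by counting the positions at which a probe fails to base-pair with a target site. The function names are hypothetical. A mismatch count is only a crude proxy: whether a given number of mismatches still permits hybridization depends on the conditions used and is not decided by this sketch.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def mismatches(probe, target_site):
    """Count positions where the probe does not pair with the target site.

    Both sequences are written 5' to 3' and must have equal length.
    """
    if len(probe) != len(target_site):
        raise ValueError("probe and target site must have equal length")
    perfect_probe = reverse_complement(target_site)
    return sum(1 for p, q in zip(probe.upper(), perfect_probe) if p != q)
```

A perfectly complementary probe returns zero; each base substitution in the probe raises the count by one.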

[68] A nucleic acid probe is generally single stranded but can be partially 
single and partially double stranded. The strandedness of the probe is dictated by the 
structure, composition, and properties of the target sequence. In general, the nucleic acid 
probes range from about 8 to about 100 bases long, with from about 10 to about 80 bases 

being preferred, and from about 30 to about 50 bases being particularly preferred. That is,
generally whole genes are not used. In some embodiments, much longer nucleic acids can be 
used, up to hundreds of bases. 

[69] In a preferred embodiment, more than one probe per sequence is used, 
with either overlapping probes or probes to different sections of the target being used. That
is, two, three, four or more probes, with three being preferred, are used to build in a
redundancy for a particular target. The probes can be overlapping (i.e. have some sequence 
in common), or separate. 
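
The use of several probes per target described in [68] and [69] can be sketched as a simple tiling routine in Python. The function name, the default probe length of 40 bases (within the 30 to 50 base range noted above) and the even spacing of probes along the target are assumptions made for illustration; three probes per target follows the stated preference.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def tile_probes(target, probe_len=40, n_probes=3):
    """Return (start, probe) pairs spaced evenly along the target.

    Each probe is the reverse complement of its target site, so that it
    can hybridize to the target strand. Probes overlap whenever the
    spacing between start positions is smaller than probe_len.
    """
    target = target.upper()
    if n_probes < 1 or len(target) < probe_len:
        raise ValueError("target too short or invalid probe count")
    last_start = len(target) - probe_len
    if n_probes == 1:
        starts = [0]
    else:
        starts = [i * last_start // (n_probes - 1) for i in range(n_probes)]
    probes = []
    for start in starts:
        site = target[start:start + probe_len]
        probes.append((start, site.translate(_COMPLEMENT)[::-1]))
    return probes
```

On a 100-base target the default settings place probes at positions 0, 30 and 60, so neighboring probes share 10 bases; a longer target would yield separate, non-overlapping probes.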

[70] As will be appreciated by those in the art, nucleic acids can be 
attached or immobilized to a solid support in a wide variety of ways. By "immobilized" and
grammatical equivalents herein is meant the association or binding between the nucleic acid 
probe and the solid support is sufficient to be stable under the conditions of binding, washing, 
analysis, and removal as outlined below. The binding can be covalent or non-covalent. By
"non-covalent binding" and grammatical equivalents herein is meant one or more of
electrostatic, hydrophilic, and hydrophobic interactions. Included in non-covalent binding is
the covalent attachment of a molecule, such as streptavidin, to the support and the
non-covalent binding of the biotinylated probe to the streptavidin. By "covalent binding" and
grammatical equivalents herein is meant that the two moieties, the solid support and the 
probe, are attached by at least one bond, including sigma bonds, pi bonds and coordination 

bonds. Covalent bonds can be formed directly between the probe and the solid support or can
be formed by a cross linker or by inclusion of a specific reactive group on either the solid 
support or the probe or both molecules. Immobilization may also involve a combination of 
covalent and non-covalent interactions. 

[71] In general, the probes are attached to the biochip in a wide variety of 

ways, as will be appreciated by those in the art. As described herein, the nucleic acids can
either be synthesized first, with subsequent attachment to the biochip, or can be directly 
synthesized on the biochip. 

[72] The biochip comprises a suitable solid substrate. By "substrate" or 
"solid support" or other grammatical equivalents herein is meant any material that can be 

modified to contain discrete individual sites appropriate for the attachment or association of
the nucleic acid probes and is amenable to at least one detection method. As will be 
appreciated by those in the art, the number of possible substrates is very large; they include,
but are not limited to, glass and modified or functionalized glass, plastics (including acrylics, 
polystyrene and copolymers of styrene and other materials, polypropylene, polyethylene, 

polybutylene, polyurethanes, Teflon™, etc.), polysaccharides, nylon or nitrocellulose, resins,
silica or silica-based materials including silicon and modified silicon, carbon, metals, 
inorganic glasses, plastics, etc. In general, the substrates allow optical detection and do not 
appreciably fluoresce. A preferred substrate is described in copending application entitled
Reusable Low Fluorescent Plastic Biochip, U.S. Application Serial No. 09/270,214, filed
March 15, 1999, herein incorporated by reference in its entirety. 

[73] Generally the substrate is planar, although as will be appreciated by 
those in the art, other configurations of substrates may be used as well. For example, the 

probes may be placed on the inside surface of a tube, for flow-through sample analysis to
minimize sample volume. Similarly, the substrate may be flexible, such as a flexible foam, 
including closed cell foams made of particular plastics. 

[74] In a preferred embodiment, the surface of the biochip and the probe 
may be derivatized with chemical functional groups for subsequent attachment of the two. 

Thus, for example, the biochip is derivatized with a chemical functional group including, but
not limited to, amino groups, carboxy groups, oxo groups and thiol groups, with amino 
groups being particularly preferred. Using these functional groups, the probes can be 
attached using functional groups on the probes. For example, nucleic acids containing amino 
groups can be attached to surfaces comprising amino groups, for example using linkers as are 

known in the art; for example, homo- or hetero-bifunctional linkers as are well known (see
1994 Pierce Chemical Company catalog, technical section on cross-linkers, pages 155-200, 
incorporated herein by reference). In addition, in some cases, additional linkers, such as 
alkyl groups (including substituted and heteroalkyl groups) may be used. 

[75] In this embodiment, the oligonucleotides are synthesized as is known 

in the art, and then attached to the surface of the solid support. As will be appreciated by
those skilled in the art, either the 5' or 3' terminus may be attached to the solid support, or 
attachment may be via an internal nucleoside. 

[76] In an additional embodiment, the immobilization to the solid support 
may be very strong, yet non-covalent. For example, biotinylated oligonucleotides can be 

made, which bind to surfaces covalently coated with streptavidin, resulting in attachment.

[77] Alternatively, the oligonucleotides may be synthesized on the surface, 
as is known in the art. For example, photoactivation techniques utilizing photopolymerization 
compounds and techniques are used. In a preferred embodiment, the nucleic acids can be 
synthesized in situ, using well known photolithographic techniques, such as those described 

in WO 95/25116; WO 95/35505; U.S. Patent Nos. 5,700,637 and 5,445,934; and references
cited within, all of which are expressly incorporated by reference; these methods of 
attachment form the basis of the Affymetrix GeneChip™ technology.

[78] In a preferred embodiment, colorectal cancer nucleic acids encoding
colorectal cancer proteins are used to make a variety of expression vectors to express 
colorectal cancer proteins which can then be used in screening assays, as described below. 
The expression vectors may be either self-replicating extrachromosomal vectors or vectors 
which integrate into a host genome. Generally, these expression vectors include

transcriptional and translational regulatory nucleic acid operably linked to the nucleic acid 
encoding the colorectal cancer protein. The term "control sequences" refers to DNA 
sequences necessary for the expression of an operably linked coding sequence in a particular 
host organism. The control sequences that are suitable for prokaryotes, for example, include 

a promoter, optionally an operator sequence, and a ribosome binding site. Eukaryotic cells
are known to utilize promoters, polyadenylation signals, and enhancers.

[79] Nucleic acid is "operably linked" when it is placed into a functional
relationship with another nucleic acid sequence. For example, DNA for a presequence or
secretory leader is operably linked to DNA for a polypeptide if it is expressed as a preprotein
that participates in the secretion of the polypeptide; a promoter or enhancer is operably linked
to a coding sequence if it affects the transcription of the sequence; or a ribosome binding site
is operably linked to a coding sequence if it is positioned so as to facilitate translation.
Generally, "operably linked" means that the DNA sequences being linked are contiguous,
and, in the case of a secretory leader, contiguous and in reading phase. However, enhancers
do not have to be contiguous. Linking is accomplished by ligation at convenient restriction
sites. If such sites do not exist, synthetic oligonucleotide adaptors or linkers are used in
accordance with conventional practice. The transcriptional and translational regulatory
nucleic acid will generally be appropriate to the host cell used to express the colorectal cancer
protein; for example, transcriptional and translational regulatory nucleic acid sequences from
Bacillus are preferably used to express the colorectal cancer protein in Bacillus. Numerous
types of appropriate expression vectors, and suitable regulatory sequences, are known in the
art for a variety of host cells.

[80] In general, the transcriptional and translational regulatory sequences 
may include, but are not limited to, promoter sequences, ribosomal binding sites,
transcriptional start and stop sequences, translational start and stop sequences, and enhancer
or activator sequences. In a preferred embodiment, the regulatory sequences include a 
promoter and transcriptional start and stop sequences. 

[81] Promoter sequences encode either constitutive or inducible promoters. 
The promoters may be either naturally occurring promoters or hybrid promoters. Hybrid
promoters, which combine elements of more than one promoter, are also known in the art,
and are useful in the present invention. 

[82] In addition, the expression vector may comprise additional elements. 
For example, the expression vector may have two replication systems, thus allowing it to be 
maintained in two organisms, for example in mammalian or insect cells for expression and in
a prokaryotic host for cloning and amplification. Furthermore, for integrating expression
vectors, the expression vector contains at least one sequence homologous to the host cell
genome, and preferably two homologous sequences which flank the expression construct.
The integrating vector may be directed to a specific locus in the host cell by selecting the
appropriate homologous sequence for inclusion in the vector. Constructs for integrating
vectors are well known in the art. 

[83] In addition, in a preferred embodiment, the expression vector contains 
a selectable marker gene to allow the selection of transformed host cells. Selection genes are 
well known in the art and will vary with the host cell used. 

[84] The colorectal cancer proteins of the present invention are produced
by culturing a host cell transformed with an expression vector containing nucleic acid
encoding a colorectal cancer protein, under the appropriate conditions to induce or cause
expression of the colorectal cancer protein. The conditions appropriate for colorectal cancer
protein expression will vary with the choice of the expression vector and the host cell, and
will be easily ascertained by one skilled in the art through routine experimentation. For
example, the use of constitutive promoters in the expression vector will require optimizing
the growth and proliferation of the host cell, while the use of an inducible promoter requires
the appropriate growth conditions for induction. In addition, in some embodiments, the
timing of the harvest is important. For example, the baculoviral systems used in insect cell
expression are lytic viruses, and thus harvest time selection can be crucial for product yield.

[85] Appropriate host cells include yeast, bacteria, archaebacteria, fungi, 
and insect and animal cells, including mammalian cells. Of particular interest are Drosophila
melanogaster cells, Saccharomyces cerevisiae and other yeasts, E. coli, Bacillus subtilis, Sf9
cells, C129 cells, 293 cells, Neurospora, BHK, CHO, COS, HeLa cells, THP1 cell line (a
macrophage cell line) and human cells and cell lines.

[86] In a preferred embodiment, the colorectal cancer proteins are 
expressed in mammalian cells. Mammalian expression systems are also known in the art, and 
include retroviral systems. A preferred expression vector system is a retroviral vector system 
such as is generally described in PCT/US97/01019 and PCT/US97/01048, both of which are
hereby expressly incorporated by reference. Of particular use as mammalian promoters are
the promoters from mammalian viral genes, since the viral genes are often highly expressed
and have a broad host range. Examples include the SV40 early promoter, mouse mammary
tumor virus LTR promoter, adenovirus major late promoter, herpes simplex virus promoter,
and the CMV promoter. Typically, transcription termination and polyadenylation sequences
recognized by mammalian cells are regulatory regions located 3' to the translation stop codon
and thus, together with the promoter elements, flank the coding sequence. Examples of
transcription terminator and polyadenylation signals include those derived from SV40.

[87] The methods of introducing exogenous nucleic acid into mammalian
hosts, as well as other hosts, are well known in the art, and will vary with the host cell used.
Techniques include dextran-mediated transfection, calcium phosphate precipitation,
polybrene-mediated transfection, protoplast fusion, electroporation, viral infection,
encapsulation of the polynucleotide(s) in liposomes, and direct microinjection of the DNA
into nuclei.

[88] In a preferred embodiment, colorectal cancer proteins are expressed in
bacterial systems. Bacterial expression systems are well known in the art. Promoters from
bacteriophage may also be used and are known in the art. In addition, synthetic promoters
and hybrid promoters are also useful; for example, the tac promoter is a hybrid of the trp and
lac promoter sequences. Furthermore, a bacterial promoter can include naturally occurring
promoters of non-bacterial origin that have the ability to bind bacterial RNA polymerase and
initiate transcription. In addition to a functioning promoter sequence, an efficient ribosome
binding site is desirable. The expression vector may also include a signal peptide sequence
that provides for secretion of the colorectal cancer protein in bacteria. The protein is either
secreted into the growth media (gram-positive bacteria) or into the periplasmic space, located
between the inner and outer membrane of the cell (gram-negative bacteria). The bacterial
expression vector may also include a selectable marker gene to allow for the selection of
bacterial strains that have been transformed. Suitable selection genes include genes which
render the bacteria resistant to drugs such as ampicillin, chloramphenicol, erythromycin,
kanamycin, neomycin and tetracycline. Selectable markers also include biosynthetic genes,
such as those in the histidine, tryptophan and leucine biosynthetic pathways. These
components are assembled into expression vectors. Expression vectors for bacteria are well
known in the art, and include vectors for Bacillus subtilis, E. coli, Streptococcus cremoris,
and Streptococcus lividans, among others. The bacterial expression vectors are transformed
into bacterial host cells using techniques well known in the art, such as calcium chloride
treatment, electroporation, and others.

[89] In one embodiment, colorectal cancer proteins are produced in insect 
cells. Expression vectors for the transformation of insect cells, and in particular,
baculovirus-based expression vectors, are well known in the art.

[90] In a preferred embodiment, colorectal cancer protein is produced in 
yeast cells. Yeast expression systems are well known in the art, and include expression 
vectors for Saccharomyces cerevisiae, Candida albicans and C. maltosa, Hansenula 
polymorpha, Kluyveromyces fragilis and K. lactis, Pichia guilliermondii and P. pastoris,
Schizosaccharomyces pombe, and Yarrowia lipolytica.

[91] The colorectal cancer protein may also be made as a fusion protein, 
using techniques well known in the art. Thus, for example, for the creation of monoclonal 
antibodies, if the desired epitope is small, the colorectal cancer protein may be fused to a 
carrier protein to form an immunogen. Alternatively, the colorectal cancer protein may be
made as a fusion protein to increase expression, or for other reasons. For example, when the
colorectal cancer protein is a colorectal cancer peptide, the nucleic acid encoding the peptide 
may be linked to other nucleic acid for expression purposes. 

[92] In one embodiment, the colorectal cancer nucleic acids, proteins and 
antibodies of the invention are labeled. By "labeled" herein is meant that a compound has at
least one element, isotope or chemical compound attached to enable the detection of the
compound. In general, labels fall into three classes: a) isotopic labels, which may be
radioactive or heavy isotopes; b) immune labels, which may be antibodies or antigens; and c)
colored or fluorescent dyes. The labels may be incorporated into the colorectal cancer
nucleic acids, proteins and antibodies at any position. The label should be
capable of producing, either directly or indirectly, a detectable signal. The detectable moiety
may be a radioisotope, such as 3H, 14C, 32P, 35S, or 125I, a fluorescent or
chemiluminescent compound, such as fluorescein isothiocyanate, rhodamine, or luciferin, or
an enzyme, such as alkaline phosphatase, beta-galactosidase or horseradish peroxidase. Any
method known in the art for conjugating the antibody to the label may be employed,
including those methods described by Hunter et al., Nature, 144:945 (1962); David et al.,
Biochemistry, 13:1014 (1974); Pain et al., J. Immunol. Meth., 40:219 (1981); and Nygren, J.
Histochem. and Cytochem., 30:407 (1982).

[93] Accordingly, the present invention also provides colorectal cancer 
protein sequences. A colorectal cancer protein of the present invention may be identified in
several ways. "Protein" in this sense includes proteins, polypeptides, and peptides, terms
which are used interchangeably herein to refer to a polymer of amino acid residues. The
terms apply to amino acid polymers in which one or more amino acid residues are an artificial
chemical mimetic of a corresponding naturally occurring amino acid, as well as to naturally
occurring amino acid polymers, those containing modified residues, and non-naturally
occurring amino acid polymers.

[94] As will be appreciated by those in the art, the nucleic acid sequences 
of the invention can be used to generate protein sequences. There are a variety of ways to do 
this, including cloning the entire gene and verifying its frame and amino acid sequence, or by
comparing it to known sequences to search for homology to provide a frame, assuming the
colorectal cancer protein has homology to some protein in the database being used.
Generally, the nucleic acid sequences are input into a program that will search all three
frames for homology. This is done in a preferred embodiment using the following NCBI
Advanced BLAST parameters. The program is blastx or blastn. The database is nr. The
input data is entered as "Sequence in FASTA format". The organism list is "none". The
"expect" is 10; the filter is default. The "descriptions" is 500, the "alignments" is 500, and the
"alignment view" is pairwise. The "Query Genetic Codes" is standard (1). The matrix is
BLOSUM62; gap existence cost is 11, per residue gap cost is 1; and the lambda ratio is .85
default. This results in the generation of a putative protein sequence.
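
By way of illustration only, the three-frame search described above presupposes that each forward reading frame of the nucleic acid is translated before comparison. The following minimal sketch, assuming the standard genetic code (code 1) and using hypothetical function names not taken from this specification, shows one way such translations could be produced; it is not the BLAST program itself.

```python
from itertools import product

# Standard genetic code (NCBI table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, frame=0):
    """Translate a nucleotide string starting at offset `frame` (0, 1 or 2).

    Unknown codons (for example those containing N) are reported as 'X';
    stop codons are reported as '*'.
    """
    seq = seq.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frames(seq):
    """Return the translations of the three forward reading frames."""
    return [translate(seq, frame) for frame in range(3)]


if __name__ == "__main__":
    for frame, protein in enumerate(three_frames("ATGGCCTAA"), start=1):
        print(f"frame {frame}: {protein}")
```

Each translated frame could then be compared against a protein database; the frame yielding significant homology suggests the putative reading frame.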

[95] Also included within one embodiment of colorectal cancer proteins
are amino acid variants of the naturally occurring sequences, as determined herein.
Preferably, the variants are greater than about 75% homologous to the wild-type
sequence, more preferably greater than about 80%, even more preferably greater than about
85% and most preferably greater than 90%. In some embodiments the homology will be as
high as about 93 to 95 or 98%. As for nucleic acids, homology in this context means
sequence similarity or identity, with identity being preferred. This homology will be
determined using standard techniques known in the art as are outlined above for the nucleic
acid homologies.
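
As a simplified illustration of the percentage ranges recited above, the following sketch computes percent identity over two sequences that are assumed to have been aligned already (gaps written as "-") and maps the result onto the stated thresholds. The function names and the treatment of the "about" qualifiers as hard cutoffs are assumptions made for the example, and the calculation is not a substitute for the standard alignment techniques referred to above.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences carry a gap are ignored; a column counts as
    a match only when both residues are identical and not a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def homology_tier(pct):
    """Map a percentage onto the preference tiers recited in the text."""
    tiers = ((90, "most preferred"),
             (85, "even more preferred"),
             (80, "more preferred"),
             (75, "preferred"))
    for cutoff, label in tiers:
        if pct > cutoff:
            return label
    return "below the stated ranges"


if __name__ == "__main__":
    pct = percent_identity("MKTAYIAK", "MKTAYIAR")
    print(pct, homology_tier(pct))
```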

[96] Colorectal cancer proteins of the present invention may be shorter or 
longer than the wild type amino acid sequences. Thus, in a preferred embodiment, included
within the definition of colorectal cancer proteins are portions or fragments of the wild type
sequences herein. In addition, as outlined above, the colorectal cancer nucleic acids of the
invention may be used to obtain additional coding regions, and thus additional protein 
sequence, using techniques known in the art.

[97] In a preferred embodiment, the colorectal cancer proteins are
derivative or variant colorectal cancer proteins as compared to the wild-type sequence. That
is, as outlined more fully below, the derivative colorectal cancer peptide will contain at least
one amino acid substitution, deletion or insertion, with amino acid substitutions being
particularly preferred. The amino acid substitution, insertion or deletion may occur at any
residue within the colorectal cancer peptide.

[98] Also included in an embodiment of colorectal cancer proteins of the 
present invention are amino acid sequence variants. These variants fall into one or more of 
three classes: substitutional, insertional or deletional variants. These variants ordinarily are
prepared by site specific mutagenesis of nucleotides in the DNA encoding the colorectal
cancer protein, using cassette or PCR mutagenesis or other techniques well known in the art,
to produce DNA encoding the variant, and thereafter expressing the DNA in recombinant cell
culture as outlined above. However, variant colorectal cancer protein fragments having up to
about 100-150 residues may be prepared by in vitro synthesis using established techniques.
Amino acid sequence variants are characterized by the predetermined nature of the variation,
a feature that sets them apart from naturally occurring allelic or interspecies variation of the 
colorectal cancer protein amino acid sequence. The variants typically exhibit the same 
qualitative biological activity as the naturally occurring analogue, although variants can also 
be selected which have modified characteristics as will be more fully outlined below. 

[99] While the site or region for introducing an amino acid sequence
variation is predetermined, the mutation per se need not be predetermined. For example, in
order to optimize the performance of a mutation at a given site, random mutagenesis may be
conducted at the target codon or region and the expressed colorectal cancer variants screened
for the optimal combination of desired activity. Techniques for making substitution
mutations at predetermined sites in DNA having a known sequence are well known, for
example, M13 primer mutagenesis and PCR mutagenesis. Screening of the mutants is done
using assays of colorectal cancer protein activities.

[100] Amino acid substitutions are typically of single residues; insertions 
usually will be on the order of from about 1 to 20 amino acids, although considerably larger
insertions may be tolerated. Deletions range from about 1 to about 20 residues, although in
some cases deletions may be much larger. 

[101] Substitutions, deletions, insertions or any combination thereof may be 
used to arrive at a final derivative. Generally these changes are done on a few amino acids to 
minimize the alteration of the molecule. However, larger changes may be tolerated in certain
circumstances. When small alterations in the characteristics of the colorectal cancer protein
are desired, substitutions are generally made in accordance with the following chart: 

Chart I

Original Residue    Exemplary Substitutions

Ala                 Ser
Arg                 Lys
Asn                 Gln, His
Asp                 Glu
Cys                 Ser
Gln                 Asn
Glu                 Asp
Gly                 Pro
His                 Asn, Gln
Ile                 Leu, Val
Leu                 Ile, Val
Lys                 Arg, Gln, Glu
Met                 Leu, Ile
Phe                 Met, Leu, Tyr
Ser                 Thr
Thr                 Ser
Trp                 Tyr
Tyr                 Trp, Phe
Val                 Ile, Leu

[102] Substantial changes in function or immunological identity are made by 
selecting substitutions that are less conservative than those shown in Chart I. For example, 
substitutions may be made which more significantly affect: the structure of the polypeptide 
backbone in the area of the alteration, for example the alpha-helical or beta-sheet structure; 
the charge or hydrophobicity of the molecule at the target site; or the bulk of the side chain.
The substitutions which in general are expected to produce the greatest changes in the
polypeptide's properties are those in which (a) a hydrophilic residue, e.g. seryl or threonyl, is
substituted for (or by) a hydrophobic residue, e.g. leucyl, isoleucyl, phenylalanyl, valyl or
alanyl; (b) a cysteine or proline is substituted for (or by) any other residue; (c) a residue
having an electropositive side chain, e.g. lysyl, arginyl, or histidyl, is substituted for (or by)
an electronegative residue, e.g. glutamyl or aspartyl; or (d) a residue having a bulky side
chain, e.g. phenylalanine, is substituted for (or by) one not having a side chain, e.g. glycine.
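
For illustration, the exemplary substitutions of Chart I can be held in a simple lookup so that a proposed substitution may be checked against the chart. The sketch below uses the standard three-letter residue codes; the names CHART_I and is_chart_i_substitution are hypothetical and are introduced only for this example. The chart is read directionally, so that a substitution listed for one original residue is not assumed to hold in reverse.

```python
# Chart I as a mapping: original residue -> exemplary (conservative) substitutions.
CHART_I = {
    "Ala": {"Ser"},
    "Arg": {"Lys"},
    "Asn": {"Gln", "His"},
    "Asp": {"Glu"},
    "Cys": {"Ser"},
    "Gln": {"Asn"},
    "Glu": {"Asp"},
    "Gly": {"Pro"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr"},
    "Thr": {"Ser"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu"},
}


def is_chart_i_substitution(original, replacement):
    """Return True if `replacement` is listed in Chart I for `original`."""
    return replacement in CHART_I.get(original, set())


if __name__ == "__main__":
    print(is_chart_i_substitution("Ala", "Ser"))  # listed in Chart I
    print(is_chart_i_substitution("Phe", "Gly"))  # not listed; a less conservative change
```

A substitution not found in the lookup would, under the discussion above, be a candidate for producing a more substantial change in function or immunological identity.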

[103] The variants typically exhibit the same qualitative biological activity 
and will elicit the same immune response as the naturally-occurring analogue, although
variants also are selected to modify the characteristics of the colorectal cancer proteins as 
needed. Alternatively, the variant may be designed such that the biological activity of the 
colorectal cancer protein is altered. For example, glycosylation sites may be altered or 
removed. 

[104] Covalent modifications of colorectal cancer polypeptides are included
within the scope of this invention. One type of covalent modification includes reacting
targeted amino acid residues of a colorectal cancer polypeptide with an organic derivatizing
agent that is capable of reacting with selected side chains or the N- or C-terminal residues of a
colorectal cancer polypeptide. Derivatization with bifunctional agents is useful, for instance,
for crosslinking colorectal cancer polypeptides to a water-insoluble support matrix or surface
for use in the method for purifying anti-colorectal cancer antibodies or screening assays, as is
more fully described below. Commonly used crosslinking agents include, e.g.,
1,1-bis(diazoacetyl)-2-phenylethane, glutaraldehyde, N-hydroxysuccinimide esters, for
example, esters with 4-azidosalicylic acid, homobifunctional imidoesters, including
disuccinimidyl esters such as 3,3'-dithiobis(succinimidylpropionate), bifunctional maleimides
such as bis-N-maleimido-1,8-octane and agents such as
methyl-3-[(p-azidophenyl)dithio]propioimidate.

[105] Other modifications include deamidation of glutaminyl and
asparaginyl residues to the corresponding glutamyl and aspartyl residues, respectively,
hydroxylation of proline and lysine, phosphorylation of hydroxyl groups of seryl, threonyl or
tyrosyl residues, methylation of the α-amino groups of lysine, arginine, and histidine side
chains [T.E. Creighton, Proteins: Structure and Molecular Properties, W.H. Freeman & Co.,
San Francisco, pp. 79-86 (1983)], acetylation of the N-terminal amine, and amidation of any
C-terminal carboxyl group.

[106] Another type of covalent modification of the colorectal cancer
polypeptide included within the scope of this invention comprises altering the native
glycosylation pattern of the polypeptide. "Altering the native glycosylation pattern" is
intended for purposes herein to mean deleting one or more carbohydrate moieties found in
native sequence colorectal cancer polypeptide, and/or adding one or more glycosylation sites
that are not present in the native sequence colorectal cancer polypeptide.

[107] Addition of glycosylation sites to colorectal cancer polypeptides may
be accomplished by altering the amino acid sequence thereof. The alteration may be made,
for example, by the addition of, or substitution by, one or more serine or threonine residues to
the native sequence colorectal cancer polypeptide (for O-linked glycosylation sites). The
colorectal cancer amino acid sequence may optionally be altered through changes at the
DNA level, particularly by mutating the DNA encoding the colorectal cancer polypeptide at 
preselected bases such that codons are generated that will translate into the desired amino 
acids. 

[108] Another means of increasing the number of carbohydrate moieties on 
the colorectal cancer polypeptide is by chemical or enzymatic coupling of glycosides to the
polypeptide. Such methods are described in the art, e.g., in WO 87/05330 published 11
September 1987, and in Aplin and Wriston, CRC Crit. Rev. Biochem., pp. 259-
306 (1981).

[109] Removal of carbohydrate moieties present on the colorectal cancer
polypeptide may be accomplished chemically or enzymatically or by mutational substitution
of codons encoding for amino acid residues that serve as targets for glycosylation. Chemical
deglycosylation techniques are known in the art and described, for instance, by Hakimuddin,
et al., Arch. Biochem. Biophys., 259:52 (1987) and by Edge et al., Anal. Biochem., 118:131
(1981). Enzymatic cleavage of carbohydrate moieties on polypeptides can be achieved by the
use of a variety of endo- and exo-glycosidases as described by Thotakura et al., Meth.
Enzymol., 138:350 (1987).

[110] Another type of covalent modification of colorectal cancer polypeptides comprises
linking the colorectal cancer polypeptide to one of a variety of nonproteinaceous polymers,
e.g., polyethylene glycol, polypropylene glycol, or polyoxyalkylenes, in the manner set forth
in U.S. Patent Nos. 4,640,835; 4,496,689; 4,301,144; 4,670,417; 4,791,192 or 4,179,337.

[111] Colorectal cancer polypeptides of the present invention may also be
modified in a way to form chimeric molecules comprising a colorectal cancer polypeptide
fused to another, heterologous polypeptide or amino acid sequence. In one embodiment, such
a chimeric molecule comprises a fusion of a colorectal cancer polypeptide with a tag
polypeptide which provides an epitope to which an anti-tag antibody can selectively bind.
The epitope tag is generally placed at the amino- or carboxyl-terminus of the colorectal cancer
polypeptide. The presence of such epitope-tagged forms of a colorectal cancer polypeptide
can be detected using an antibody against the tag polypeptide. Also, provision of the epitope
tag enables the colorectal cancer polypeptide to be readily purified by affinity purification
using an anti-tag antibody or another type of affinity matrix that binds to the epitope tag. In
an alternative embodiment, the chimeric molecule may comprise a fusion of a colorectal
cancer polypeptide with an immunoglobulin or a particular region of an immunoglobulin.
For a bivalent form of the chimeric molecule, such a fusion could be to the Fc region of an
IgG molecule.

[112] Various tag polypeptides and their respective antibodies are well
known in the art. Examples include poly-histidine (poly-his) or poly-histidine-glycine (poly-
his-gly) tags; the flu HA tag polypeptide and its antibody 12CA5 [Field et al., Mol. Cell.
Biol., 8:2159-2165 (1988)]; the c-myc tag and the 8F9, 3C7, 6E10, G4, B7 and 9E10
antibodies thereto [Evan et al., Molecular and Cellular Biology, 5:3610-3616 (1985)]; and the
Herpes Simplex virus glycoprotein D (gD) tag and its antibody [Paborsky et al., Protein
Engineering, 3(6):547-553 (1990)]. Other tag polypeptides include the Flag-peptide [Hopp et
al., BioTechnology, 6:1204-1210 (1988)]; the KT3 epitope peptide [Martin et al., Science,
255:192-194 (1992)]; tubulin epitope peptide [Skinner et al., J. Biol. Chem., 266:15163-
15166 (1991)]; and the T7 gene 10 protein peptide tag [Lutz-Freyermuth et al., Proc. Natl.
Acad. Sci. USA, 87:6393-6397 (1990)].

[113] Also included within the definition of colorectal cancer protein in one
embodiment are other colorectal cancer proteins of the colorectal cancer family, and
colorectal cancer proteins from other organisms, which are cloned and expressed as outlined
below. Thus, probe or degenerate polymerase chain reaction (PCR) primer sequences may be
used to find other related colorectal cancer proteins from humans or other organisms. As
will be appreciated by those in the art, particularly useful probe and/or PCR primer sequences
include the unique areas of the colorectal cancer nucleic acid sequence. As is generally
known in the art, preferred PCR primers are from about 15 to about 35 nucleotides in length,
with from about 20 to about 30 being preferred, and may contain inosine as needed. The
conditions for the PCR reaction are well known in the art.
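
The primer length ranges stated above may be illustrated by a short check such as the following. This is a sketch only: the function name is hypothetical, the "about" qualifiers are treated as hard cutoffs for simplicity, and the accepted alphabet (A, C, G, T plus I for inosine) is an assumption made for the example.

```python
ALLOWED_BASES = set("ACGTI")  # I denotes inosine, permitted "as needed"


def primer_length_class(primer):
    """Classify a primer by the length ranges recited in the text."""
    primer = primer.upper()
    unexpected = set(primer) - ALLOWED_BASES
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    n = len(primer)
    if 20 <= n <= 30:
        return "more preferred (about 20 to about 30 nt)"
    if 15 <= n <= 35:
        return "preferred (about 15 to about 35 nt)"
    return "outside the stated ranges"


if __name__ == "__main__":
    print(primer_length_class("ATGGCCAAGTTCGATCGTAGCA"))  # 22 nt
    print(primer_length_class("ATGGCIAAGTTCGATC"))        # 16 nt, contains inosine
```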

[114] In addition, as is outlined herein, colorectal cancer proteins can be 
made that are longer than those depicted in Table 1 or Table 2, for example, by the
elucidation of additional sequences, the addition of epitope or purification tags, the addition
of other fusion sequences, etc.

[115] Colorectal cancer proteins may also be identified as being encoded by 
colorectal cancer nucleic acids. Thus, colorectal cancer proteins are encoded by nucleic 
acids that will hybridize to the sequences of the sequence listings, or their complements, as 
outlined herein.

[116] In a preferred embodiment, when the colorectal cancer protein is to be
used to generate antibodies, for example for immunotherapy, the colorectal cancer protein
should share at least one epitope or determinant with the full length protein. By "epitope" or
"determinant" herein is meant a portion of a protein which will generate and/or bind an
antibody or T-cell receptor in the context of MHC. Thus, in most instances, antibodies made
to a smaller colorectal cancer protein will be able to bind to the full length protein. In a
preferred embodiment, the epitope is unique; that is, antibodies generated to a unique epitope
show little or no cross-reactivity. In a preferred embodiment, the epitope is selected from a
peptide encoded by a nucleic acid of Table 1. In another preferred embodiment, the epitope is
selected from the CBF9 peptide sequence shown in Table 2.

[117] In one embodiment, the term "antibody" includes antibody fragments,
as are known in the art, including Fab, Fab2, single chain antibodies (Fv for example), 
chimeric antibodies, etc., either produced by the modification of whole antibodies or those 
synthesized de novo using recombinant DNA technologies. 

[118] Methods of preparing polyclonal antibodies are known to the skilled
artisan. Polyclonal antibodies can be raised in a mammal, for example, by one or more
injections of an immunizing agent and, if desired, an adjuvant. Typically, the immunizing
agent and/or adjuvant will be injected in the mammal by multiple subcutaneous or
intraperitoneal injections. The immunizing agent may include the CBF9 peptide of Table 2,
or a peptide encoded by a nucleic acid of Table 1 or fragment thereof or a fusion protein
thereof. It may be useful to conjugate the immunizing agent to a protein known to be
immunogenic in the mammal being immunized. Examples of such immunogenic proteins
include but are not limited to keyhole limpet hemocyanin, serum albumin, bovine
thyroglobulin, and soybean trypsin inhibitor. Examples of adjuvants which may be employed
include Freund's complete adjuvant and MPL-TDM adjuvant (monophosphoryl Lipid A,
synthetic trehalose dicorynomycolate). The immunization protocol may be selected by one
skilled in the art without undue experimentation.

[119] The antibodies may, alternatively, be monoclonal antibodies. 
Monoclonal antibodies may be prepared using hybridoma methods, such as those described
by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse, hamster,
or other appropriate host animal, is typically immunized with an immunizing agent to elicit
lymphocytes that produce or are capable of producing antibodies that will specifically bind to
the immunizing agent. Alternatively, the lymphocytes may be immunized in vitro. The
immunizing agent will typically include the CBF9 polypeptide or a peptide encoded by a
nucleic acid of Table 1 or a fragment thereof or a fusion protein thereof. Generally, either
peripheral blood lymphocytes ("PBLs") are used if cells of human origin are desired, or
spleen cells or lymph node cells are used if non-human mammalian sources are desired. The
lymphocytes are then fused with an immortalized cell line using a suitable fusing agent, such
as polyethylene glycol, to form a hybridoma cell [Goding, Monoclonal Antibodies: Principles
and Practice, Academic Press, (1986) pp. 59-103]. Immortalized cell lines are usually
transformed mammalian cells, particularly myeloma cells of rodent, bovine and human
origin. Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells may be
cultured in a suitable culture medium that preferably contains one or more substances that
inhibit the growth or survival of the unfused, immortalized cells. For example, if the parental
cells lack the enzyme hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT),
the culture medium for the hybridomas typically will include hypoxanthine, aminopterin, and
thymidine ("HAT medium"), which substances prevent the growth of HGPRT-deficient cells.

[120] In one embodiment, the antibodies are bispecific antibodies.
Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that have
binding specificities for at least two different antigens. In the present case, one of the binding 
specificities is for a colorectal cancer protein or a fragment thereof, the other one is for any 
other antigen, and preferably for a cell-surface protein or receptor or receptor subunit, 
preferably one that is tumor specific. 

20 [121] In a preferred embodiment, the antibodies to colorectal cancer are 

capable of reducing or eliminating the biological function of colorectal cancer , as is 
described below. That is, the addition of anti-colorectal cancer antibodies (either polyclonal 
or preferably monoclonal) to colorectal cancer (or cells containing colorectal cancer ) may 
reduce or eliminate the colorectal cancer activity. Generally, at least a 25% decrease in
activity is preferred, with at least about 50% being particularly preferred and about a
95-100% decrease being especially preferred.

[122] In a preferred embodiment the antibodies to the colorectal cancer 
proteins are humanized antibodies. Humanized forms of non-human (e.g., murine) antibodies 
are chimeric molecules of immunoglobulins, immunoglobulin chains or fragments thereof
(such as Fv, Fab, Fab', F(ab')2 or other antigen-binding subsequences of antibodies) which
contain minimal sequence derived from non-human immunoglobulin. Humanized antibodies
include human immunoglobulins (recipient antibody) in which residues from a
complementarity determining region (CDR) of the recipient are replaced by residues from a
CDR of a non-human species (donor antibody) such as mouse, rat or rabbit having the desired 
specificity, affinity and capacity. In some instances, Fv framework residues of the human
immunoglobulin are replaced by corresponding non-human residues. Humanized antibodies 
may also comprise residues which are found neither in the recipient antibody nor in the 
imported CDR or framework sequences. In general, the humanized antibody will comprise 
substantially all of at least one, and typically two, variable domains, in which all or
substantially all of the CDR regions correspond to those of a non-human immunoglobulin 
and all or substantially all of the FR regions are those of a human immunoglobulin consensus 
sequence. The humanized antibody optimally also will comprise at least a portion of an 
immunoglobulin constant region (Fc), typically that of a human immunoglobulin [Jones et
al., Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); and Presta,
Curr. Op. Struct. Biol., 2:593-596 (1992)]. 

[123] Methods for humanizing non-human antibodies are well known in the 
art. Generally, a humanized antibody has one or more amino acid residues introduced into it 
from a source which is non-human. These non-human amino acid residues are often referred
to as import residues, which are typically taken from an import variable domain.
Humanization can be essentially performed following the method of Winter and co-workers 
[Jones et al., Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); 
Verhoeyen et al., Science, 239:1534-1536 (1988)], by substituting rodent CDRs or CDR 
sequences for the corresponding sequences of a human antibody. Accordingly, such
humanized antibodies are chimeric antibodies (U.S. Patent No. 4,816,567), wherein
substantially less than an intact human variable domain has been substituted by the
corresponding sequence from a non-human species. In practice, humanized antibodies are
typically human antibodies in which some CDR residues and possibly some FR residues are 
substituted by residues from analogous sites in rodent antibodies. 

[124] Human antibodies can also be produced using various techniques
known in the art, including phage display libraries [Hoogenboom and Winter, J. Mol. Biol.,
227:381 (1991); Marks et al., J. Mol. Biol., 222:581 (1991)]. The techniques of Cole et al.
and Boerner et al. are also available for the preparation of human monoclonal antibodies 
[Cole et al., Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, p. 77 (1985) and
Boerner et al., J. Immunol., 147(1):86-95 (1991)]. Similarly, human antibodies can be made
by introducing human immunoglobulin loci into transgenic animals, e.g., mice in which
the endogenous immunoglobulin genes have been partially or completely inactivated. Upon 
challenge, human antibody production is observed, which closely resembles that seen in 
humans in all respects, including gene rearrangement, assembly, and antibody repertoire. 
This approach is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806;
5,569,825; 5,625,126; 5,633,425; 5,661,016, and in the following scientific publications:
Marks et al., Bio/Technology 10, 779-783 (1992); Lonberg et al., Nature 368, 856-859 (1994);
Morrison, Nature 368, 812-13 (1994); Fishwild et al., Nature Biotechnology 14, 845-51
(1996); Neuberger, Nature Biotechnology 14, 826 (1996); Lonberg and Huszar, Intern. Rev.
Immunol. 13, 65-93 (1995).

[125] By immunotherapy is meant treatment of colorectal cancer with an 
antibody raised against colorectal cancer proteins. As used herein, immunotherapy can be 
passive or active. Passive immunotherapy as defined herein is the passive transfer of
antibody to a recipient (patient). Active immunization is the induction of antibody and/or
T-cell responses in a recipient (patient). Induction of an immune response is the result of
providing the recipient with an antigen to which antibodies are raised. As appreciated by one 
of ordinary skill in the art, the antigen may be provided by injecting a polypeptide against 
which antibodies are desired to be raised into a recipient, or contacting the recipient with a
nucleic acid capable of expressing the antigen and under conditions for expression of the
antigen. 

[126] In a preferred embodiment the colorectal cancer proteins against 
which antibodies are raised are secreted proteins as described above. Without being bound 
by theory, antibodies used for treatment bind the secreted protein and prevent it from binding
to its receptor, thereby inactivating the secreted colorectal cancer protein.

[127] In another preferred embodiment, the colorectal cancer protein to 
which antibodies are raised is a transmembrane protein. Without being bound by theory, 
antibodies used for treatment bind the extracellular domain of the colorectal cancer protein
and prevent it from binding to other proteins, such as circulating ligands or cell-associated
molecules. The antibody may cause down-regulation of the transmembrane colorectal cancer
protein. As will be appreciated by one of ordinary skill in the art, the antibody may be a 
competitive, non-competitive or uncompetitive inhibitor of protein binding to the 
extracellular domain of the colorectal cancer protein. The antibody is also an antagonist of 
the colorectal cancer protein. Further, the antibody prevents activation of the transmembrane
colorectal cancer protein. In one aspect, when the antibody prevents the binding of other
molecules to the colorectal cancer protein, the antibody prevents growth of the cell. The
antibody also sensitizes the cell to cytotoxic agents, including, but not limited to TNF-α,
TNF-β, IL-1, IFN-γ and IL-2, or chemotherapeutic agents including 5FU, vinblastine,
actinomycin D, cisplatin, methotrexate, and the like. In some instances the antibody belongs
to a sub-type that activates serum complement when complexed with the transmembrane 
protein thereby mediating cytotoxicity. Thus, colorectal cancer is treated by administering to 
a patient antibodies directed against the transmembrane colorectal cancer protein. 
[128] In another preferred embodiment, the antibody is conjugated to a
therapeutic moiety. In one aspect the therapeutic moiety is a small molecule that modulates 
the activity of the colorectal cancer protein. In another aspect the therapeutic moiety 
modulates the activity of molecules associated with or in close proximity to the colorectal 
cancer protein. The therapeutic moiety may inhibit enzymatic activity such as protease or
protein kinase activity associated with colorectal cancer.

[129] In a preferred embodiment, the therapeutic moiety may also be a 
cytotoxic agent. In this method, targeting the cytotoxic agent to tumor tissue or cells results
in a reduction in the number of afflicted cells, thereby reducing symptoms associated with
colorectal cancer. Cytotoxic agents are numerous and varied and include, but are not limited
to, cytotoxic drugs or toxins or active fragments of such toxins. Suitable toxins and their
corresponding fragments include diphtheria A chain, exotoxin A chain, ricin A chain, abrin A
chain, curcin, crotin, phenomycin, enomycin and the like. Cytotoxic agents also include
radiochemicals made by conjugating radioisotopes to antibodies raised against colorectal 
cancer proteins, or binding of a radionuclide to a chelating agent that has been covalently
attached to the antibody. Targeting the therapeutic moiety to transmembrane colorectal
cancer proteins not only serves to increase the local concentration of therapeutic moiety in 
the colorectal cancer afflicted area, but also serves to reduce deleterious side effects that may 
be associated with the therapeutic moiety. 

[130] In another preferred embodiment, the colorectal cancer protein against
which the antibodies are raised is an intracellular protein. In this case, the antibody may be
conjugated to a protein which facilitates entry into the cell. In one case, the antibody enters 
the cell by endocytosis. In another embodiment, a nucleic acid encoding the antibody is 
administered to the individual or cell. Moreover, where the colorectal cancer protein can
be targeted within a cell, e.g., in the nucleus, an antibody thereto contains a signal for that target
localization, e.g., a nuclear localization signal.

[131] The colorectal cancer antibodies of the invention specifically bind to 
colorectal cancer proteins. By "specifically bind" herein is meant that the antibodies bind to 
the protein with a binding constant in the range of at least 10^-[illegible] - 10^-6 M^-1, with a preferred
range being 10^-7 - 10^-9 M^-1.
[132] In a preferred embodiment, the colorectal cancer protein is purified or
isolated after expression. Colorectal cancer proteins may be isolated or purified in a variety 
of ways known to those skilled in the art depending on what other components are present in 
the sample. Standard purification methods include electrophoretic, molecular, 
immunological and chromatographic techniques, including ion exchange, hydrophobic,
affinity, and reverse-phase HPLC chromatography, and chromatofocusing. For example, the 
colorectal cancer protein may be purified using a standard anti-colorectal cancer antibody 
column. Ultrafiltration and diafiltration techniques, in conjunction with protein 
concentration, are also useful. For general guidance in suitable purification techniques, see
Scopes, R., Protein Purification, Springer-Verlag, NY (1982). The degree of purification
necessary will vary depending on the use of the colorectal cancer protein. In some instances 
no purification will be necessary. 

[133] Once expressed and purified if necessary, the colorectal cancer 
proteins and nucleic acids are useful in a number of applications. 

[134] In one aspect, the expression levels of genes are determined for
different cellular states in the colorectal cancer phenotype; that is, the expression levels of 
genes in normal colon tissue and in colorectal cancer tissue (and in some cases, for varying 
severities of colorectal cancer that relate to prognosis, as outlined below) are evaluated to 
provide expression profiles. An expression profile of a particular cell state or point of
development is essentially a "fingerprint" of the state; while two states may have any
particular gene similarly expressed, the evaluation of a number of genes simultaneously 
allows the generation of a gene expression profile that is unique to the state of the cell. By 
comparing expression profiles of cells in different states, information regarding which genes 
are important (including both up- and down-regulation of genes) in each of these states is
obtained. Then, diagnosis may be done or confirmed: does tissue from a particular patient
have the gene expression profile of normal or colorectal cancer tissue?
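
By way of illustration only, the comparison of a patient profile with reference profiles may be sketched as follows. The sketch assumes that each profile is a list of expression values for the same ordered set of genes and uses Pearson correlation as the similarity measure. The choice of measure, the function names and all numeric values are hypothetical assumptions and are not taken from this disclosure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2 genes)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("a profile with no variation cannot be correlated")
    return cov / (sd_x * sd_y)


def classify_profile(sample, normal_ref, cancer_ref):
    """Label a sample by whichever reference profile it correlates with more."""
    r_normal = pearson(sample, normal_ref)
    r_cancer = pearson(sample, cancer_ref)
    label = "colorectal cancer-like" if r_cancer > r_normal else "normal-like"
    return label, r_normal, r_cancer


# Hypothetical expression values for four genes (illustrative only).
normal_ref = [1.0, 5.0, 2.0, 8.0]
cancer_ref = [6.0, 1.0, 7.0, 2.0]
patient = [5.5, 1.5, 6.0, 2.5]

label, r_normal, r_cancer = classify_profile(patient, normal_ref, cancer_ref)
print(label, round(r_normal, 3), round(r_cancer, 3))
```

In practice a profile would span many more genes, and the comparison would be made against reference profiles built from multiple tissue samples rather than a single list.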

[135] "Differential expression," or grammatical equivalents as used herein, 
refers to both qualitative as well as quantitative differences in the genes' temporal and/or 
cellular expression patterns within and among the cells. Thus, a differentially expressed gene
can qualitatively have its expression altered, including an activation or inactivation, in, for
example, normal versus colorectal cancer tissue. That is, genes may be turned on or turned 
off in a particular state, relative to another state. As is apparent to the skilled artisan, any 
comparison of two or more states can be made. Such a qualitatively regulated gene will 
exhibit an expression pattern within a state or cell type which is detectable by standard 
techniques in one such state or cell type, but is not detectable in both. Alternatively, the
determination is quantitative in that expression is increased or decreased; that is, the 
expression of the gene is either upregulated, resulting in an increased amount of transcript, or 
downregulated, resulting in a decreased amount of transcript. The degree to which 
expression differs need only be large enough to quantify via standard characterization
techniques as outlined below, such as by use of Affymetrix GeneChip™ expression arrays,
Lockhart, Nature Biotechnology, 14:1675-1680 (1996), hereby expressly incorporated by 
reference. Other techniques include, but are not limited to, quantitative reverse transcriptase 
PCR, Northern analysis and RNase protection. As outlined above, preferably the change in
expression (i.e. upregulation or downregulation) is at least about 50%, more preferably at
least about 100%, more preferably at least about 150%, more preferably, at least about 200%, 
with from 300 to at least 1000% being especially preferred. 
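
By way of illustration only, the percentage thresholds above may be applied as in the following sketch. The text does not state how a percentage change above 100% is computed for downregulation, so the sketch assumes a symmetric definition in which the larger level is divided by the smaller. That definition, the function name and the example values are hypothetical assumptions.

```python
# Preference tiers named in the text, in percent, highest first.
THRESHOLDS = (300, 200, 150, 100, 50)


def expression_change(reference_level, test_level):
    """Return (direction, percent change, highest tier met or None).

    Assumption: percent change is (larger / smaller - 1) * 100, so a 4-fold
    decrease and a 4-fold increase both count as a 300% change.
    """
    if reference_level <= 0 or test_level <= 0:
        raise ValueError("expression levels must be positive")
    if test_level >= reference_level:
        direction = "up"
        percent = (test_level / reference_level - 1.0) * 100.0
    else:
        direction = "down"
        percent = (reference_level / test_level - 1.0) * 100.0
    tier = next((t for t in THRESHOLDS if percent >= t), None)
    return direction, percent, tier


# Hypothetical transcript levels (arbitrary units): normal tissue, then tumor.
print(expression_change(100.0, 250.0))
print(expression_change(400.0, 100.0))
```

A returned tier of None means that the change falls below the 50% threshold.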

[136] As will be appreciated by those in the art, this may be done by 
evaluation at either the gene transcript, or the protein level; that is, the amount of gene
expression may be monitored using nucleic acid probes to the DNA or RNA equivalent of the
gene transcript, and the quantification of gene expression levels, or, alternatively, the final
gene product itself (protein) can be monitored, for example through the use of antibodies to
the colorectal cancer protein and standard immunoassays (ELISAs, etc.) or other techniques,
including mass spectroscopy assays, 2D gel electrophoresis assays, etc. Thus, the proteins
corresponding to colorectal cancer genes, i.e. those identified as being important in a
colorectal cancer phenotype, can be evaluated in a colorectal cancer diagnostic test. 

[137] In a preferred embodiment, gene expression monitoring is done and a 
number of genes, i.e. an expression profile, is monitored simultaneously, although multiple 
protein expression monitoring can be done as well. Similarly, these assays may be done on
an individual basis as well.

[138] In this embodiment, the colorectal cancer nucleic acid probes are 
attached to biochips as outlined herein for the detection and quantification of colorectal 
cancer sequences in a particular cell. The assays are further described below in the example. 
[139] In a preferred embodiment nucleic acids encoding the colorectal
cancer protein are detected. Although DNA or RNA encoding the colorectal cancer protein
may be detected, of particular interest are methods wherein the mRNA encoding a colorectal 
cancer protein is detected. The presence of mRNA in a sample is an indication that the 
colorectal cancer gene has been transcribed to form the mRNA, and suggests that the protein 
is expressed. Probes to detect the mRNA can be any nucleotide/deoxynucleotide probe that
is complementary to and base pairs with the mRNA and includes but is not limited to 
oligonucleotides, cDNA or RNA. Probes also should contain a detectable label, as defined 
herein. In one method the mRNA is detected after immobilizing the nucleic acid to be 
examined on a solid support such as nylon membranes and hybridizing the probe with the
sample. Following washing to remove the non-specifically bound probe, the label is 
detected. In another method detection of the mRNA is performed in situ. In this method 
permeabilized cells or tissue samples are contacted with a detectably labeled nucleic acid 
probe for sufficient time to allow the probe to hybridize with the target mRNA. Following 
washing to remove the non-specifically bound probe, the label is detected. For example, a
digoxigenin-labeled riboprobe (RNA probe) that is complementary to the mRNA encoding a
colorectal cancer protein is detected by binding the digoxigenin with an anti-digoxigenin
secondary antibody and developed with nitro blue tetrazolium and
5-bromo-4-chloro-3-indolyl phosphate.

[140] In a preferred embodiment, any of the three classes of proteins as
described herein (secreted, transmembrane or intracellular proteins) are used in diagnostic 
assays. The colorectal cancer proteins, antibodies, nucleic acids, modified proteins and cells 
containing colorectal cancer sequences are used in diagnostic assays. This can be done on an 
individual gene or corresponding polypeptide level. In a preferred embodiment, the
expression profiles are used, preferably in conjunction with high throughput screening
techniques to allow monitoring for expression profile genes and/or corresponding 
polypeptides. 

[141] As described and defined herein, colorectal cancer proteins, including 
intracellular, transmembrane or secreted proteins, find use as markers of colorectal cancer.
Detection of these proteins in putative colorectal cancer tissue or patients allows for a
determination or diagnosis of colorectal cancer. Numerous methods known to those of
ordinary skill in the art find use in detecting colorectal cancer. In one embodiment,
antibodies are used to detect colorectal cancer proteins. A preferred method separates 
proteins from a sample or patient by electrophoresis on a gel (typically a denaturing and
reducing protein gel, but may be any other type of gel including isoelectric focusing gels and
the like). Following separation of proteins, the colorectal cancer protein is detected by 
immunoblotting with antibodies raised against the colorectal cancer protein. Methods of 
immunoblotting are well known to those of ordinary skill in the art. 
[142] In another preferred method, antibodies to the colorectal cancer
protein find use in in situ imaging techniques. In this method cells are contacted with from 
one to many antibodies to the colorectal cancer protein(s). Following washing to remove 
non-specific antibody binding, the presence of the antibody or antibodies is detected. In one 
embodiment the antibody is detected by incubating with a secondary antibody that contains a
detectable label. In another method the primary antibody to the colorectal cancer protein(s) 
contains a detectable label. In another preferred embodiment each one of multiple primary 
antibodies contains a distinct and detectable label. This method finds particular use in 
simultaneous screening for a plurality of colorectal cancer proteins. As will be appreciated 
by one of ordinary skill in the art, numerous other histological imaging techniques are useful
in the invention. 

[143] In a preferred embodiment the label is detected in a fluorometer which 
has the ability to detect and distinguish emissions of different wavelengths. In addition, a 
fluorescence activated cell sorter (FACS) can be used in the method. 

[144] In another preferred embodiment, antibodies find use in diagnosing
colorectal cancer from blood samples. As previously described, certain colorectal cancer 
proteins are secreted/circulating molecules. Blood samples, therefore, are useful as samples 
to be probed or tested for the presence of secreted colorectal cancer proteins. Antibodies can 
be used to detect the colorectal cancer by any of the previously described immunoassay
techniques including ELISA, immunoblotting (Western blotting), immunoprecipitation,
BIACORE technology and the like, as will be appreciated by one of ordinary skill in the art.

[145] In a preferred embodiment, in situ hybridization of labeled colorectal 
cancer nucleic acid probes to tissue arrays is done. For example, arrays of tissue samples, 
including colorectal cancer tissue and/or normal tissue, are made. In situ hybridization as is
known in the art can then be done.

[146] It is understood that when comparing the fingerprints between an 
individual and a standard, the skilled artisan can make a diagnosis as well as a prognosis. It 
is further understood that the genes which indicate the diagnosis may differ from those which 
indicate the prognosis. 

[147] In a preferred embodiment, the colorectal cancer proteins, antibodies,
nucleic acids, modified proteins and cells containing colorectal cancer sequences are used in 
prognosis assays. As above, gene expression profiles can be generated that correlate to 
colorectal cancer severity, in terms of long term prognosis. Again, this may be done on 
either a protein or gene level, with the use of genes being preferred. As above, the colorectal 
cancer probes are attached to biochips for the detection and quantification of colorectal
cancer sequences in a tissue or patient. The assays proceed as outlined for diagnosis. 

[148] In a preferred embodiment, any of the three classes of proteins as 
described herein are used in drug screening assays. The colorectal cancer proteins, 
antibodies, nucleic acids, modified proteins and cells containing colorectal cancer sequences
are used in drug screening assays or by evaluating the effect of drug candidates on a "gene 
expression profile" or expression profile of polypeptides. In a preferred embodiment, the 
expression profiles are used, preferably in conjunction with high throughput screening 
techniques to allow monitoring for expression profile genes after treatment with a candidate
agent, Zlokarnik, et al., Science 279, 84-8 (1998), Heid, 1996 #69.

[149] In a preferred embodiment, the colorectal cancer proteins, antibodies, 
nucleic acids, modified proteins and cells containing the native or modified colorectal cancer 
proteins are used in screening assays. That is, the present invention provides novel methods 
for screening for compositions which modulate the colorectal cancer phenotype. As above,
this can be done on an individual gene level or by evaluating the effect of drug candidates on
a "gene expression profile". In a preferred embodiment, the expression profiles are used, 
preferably in conjunction with high throughput screening techniques to allow monitoring for 
expression profile genes after treatment with a candidate agent, see Zlokarnik, supra. 
Having identified the differentially expressed genes herein, a variety of assays may be
executed. In a preferred embodiment, assays may be run on an individual gene or protein
level. That is, having identified a particular gene as up regulated in colorectal cancer,
candidate bioactive agents may be screened to modulate this gene's response; preferably to 
down regulate the gene, although in some circumstances to up regulate the gene. 

[150] The phrase "functional effects" in the context of assays for testing
compounds that modulate activity of a colorectal cancer protein or colorectal cancer nucleic
acid includes the determination of a parameter that is indirectly or directly under the
influence of a colorectal cancer protein or nucleic acid, e.g., a physical (direct), or phenotypic
or chemical effect (indirect), such as the ability to increase or decrease cellular proliferation. 
It includes cell cycle arrest, the ability of cells to proliferate, and other characteristics of
proliferating cells. "Functional effects" include in vitro, in vivo, and ex vivo activities.

[151] By "determining the functional effect" is meant assaying for a 
compound that increases or decreases a parameter that is indirectly or directly under the 
influence of a colorectal cancer protein or nucleic acid, e.g., physical, phenotypic and
chemical effects. Such functional effects can be measured by any means known to those
skilled in the art, e.g., physical effects such as changes in spectroscopic characteristics (e.g.,
fluorescence, absorbance, refractive index); hydrodynamic (e.g., shape); chromatographic; or
solubility properties for the protein; measuring ligand binding activity or binding assays, e.g.,
binding to antibodies; measuring changes in ligand binding activity; and chemical or
phenotypic effects such as measuring inducible markers or transcriptional activation of the
protein; measuring cellular proliferation; measuring cell surface marker expression; 
measurement of changes in protein levels for colorectal cancer-associated sequences; 
measurement of RNA stability; phosphorylation or dephosphorylation; signal transduction, 
e.g., receptor-ligand interactions, second messenger concentrations (e.g., cAMP, IP3, or
intracellular Ca2+); identification of downstream or reporter gene expression (CAT,
luciferase, β-gal, GFP and the like), e.g., via chemiluminescence, fluorescence, colorimetric
reactions, antibody binding, and inducible markers. 

[152] "Inhibitors", "activators", and "modulators" of colorectal cancer
polynucleotide and polypeptide sequences are used to refer to activating, inhibitory, or
modulating molecules identified using in vitro and in vivo assays of colorectal cancer
polynucleotide and polypeptide sequences. Inhibitors are compounds that, e.g., bind to, 
partially or totally block activity, decrease, prevent, delay activation, inactivate, desensitize, 
or down regulate the activity or expression of colorectal cancer proteins or nucleic acids, e.g.,
antagonists. "Activators" are compounds that increase, open, activate, facilitate, enhance
activation, sensitize, agonize, or up regulate colorectal cancer protein or nucleic acid activity.
Inhibitors, activators, or modulators also include genetically modified versions of colorectal 
cancer proteins, e.g., versions with altered activity, as well as naturally occurring and 
synthetic ligands, antagonists, agonists, antibodies, antisense molecules, peptides, ribozymes, 
small chemical molecules and the like. Such assays for inhibitors and activators include, e.g.,
expressing colorectal cancer protein in vitro, in cells, or cell membranes, applying putative
modulator compounds, and then determining the functional effects on activity, as described
above. 

[153] Samples or assays comprising colorectal cancer proteins or colorectal 
cancer nucleic acids that are treated with a potential activator, inhibitor, or modulator are 
compared to control samples without the inhibitor, activator, or modulator to examine the
extent of inhibition. Control samples (untreated with inhibitors) are assigned a relative 
activity value of 100%. Inhibition of colorectal cancer is achieved when the activity value 
relative to the control is about 80%, preferably 50%, more preferably 25-0%. Activation of 
colorectal cancer is achieved when the activity value relative to the control (untreated with
activators) is 110%, more preferably 150%, more preferably 200-500% (i.e., two to five fold 
higher relative to the control), more preferably 1000-3000% higher. 
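
By way of illustration only, the comparison of a treated sample with its control may be sketched as follows. The control is assigned a relative activity of 100%, and the 80% and 110% cut-offs are the broadest values named in the text. The function names, the label for intermediate values and the example readings are hypothetical assumptions.

```python
def relative_activity(treated, control):
    """Activity of the treated sample as a percentage of the untreated control."""
    if control <= 0:
        raise ValueError("control activity must be positive")
    return treated * 100.0 / control


def classify_modulator(treated, control):
    """Apply the broadest cut-offs from the text: <= 80% inhibits, >= 110% activates."""
    percent = relative_activity(treated, control)
    if percent <= 80.0:
        return "inhibitor", percent
    if percent >= 110.0:
        return "activator", percent
    # Assumption: values between the two cut-offs are not called either way.
    return "no clear effect", percent


# Hypothetical assay readings (arbitrary units).
print(classify_modulator(40.0, 100.0))
print(classify_modulator(250.0, 100.0))
print(classify_modulator(95.0, 100.0))
```

The stricter preferred ranges (for example 50% or 25-0% for inhibition) could be applied in the same manner by changing the cut-off values.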

[154] As will be appreciated by those in the art, this may be done by 
evaluation at either the gene or the protein level; that is, the amount of gene expression may
be monitored using nucleic acid probes and the quantification of gene expression levels, or, 
alternatively, the gene product itself can be monitored, for example through the use of 
antibodies to the colorectal cancer protein and standard immunoassays. 

[155] In a preferred embodiment, gene expression monitoring is done and a
number of genes, i.e. an expression profile, is monitored simultaneously, although multiple
protein expression monitoring can be done as well. 

[156] In this embodiment, the colorectal cancer nucleic acid probes are 
attached to biochips as outlined herein for the detection and quantification of colorectal 
cancer sequences in a particular cell. The assays are further described below. 

[157] Generally, in a preferred embodiment, a candidate bioactive agent is
added to the cells prior to analysis. Moreover, screens are provided to identify a candidate 
bioactive agent which modulates colorectal cancer, modulates colorectal cancer proteins, 
binds to a colorectal cancer protein, or interferes with the binding between a colorectal cancer
protein and an antibody. 

[158] The term "candidate bioactive agent" or "test compound" or "drug
candidate" or "modulator" or grammatical equivalents as used herein describes any molecule, 
either naturally occurring or synthetic, e.g., protein, oligopeptide (e.g., from about 5 to about 
25 amino acids in length, preferably from about 10 to 20 or 12 to 18 amino acids in length, 
preferably 12, 15, or 18 amino acids in length), small organic molecule, polysaccharide, lipid,
fatty acid, polynucleotide, oligonucleotide, etc., to be tested for the capacity to directly or
indirectly modulate colorectal cancer sequences, including both nucleic acid and protein 
sequences. The test compound can be in the form of a library of test compounds, such as a 
combinatorial or randomized library that provides a sufficient range of diversity. Test 
compounds are optionally linked to a fusion partner, e.g., targeting compounds, rescue
compounds, dimerization compounds, stabilizing compounds, addressable compounds, and
other functional moieties. Conventionally, new chemical entities with useful properties are
generated by identifying a test compound (called a "lead compound") with some desirable 
property or activity, e.g., inhibiting activity, creating variants of the lead compound, and 
evaluating the property and activity of those variant compounds. Often, high throughput
screening (HTS) methods are employed for such an analysis. 

[159] In preferred embodiments, the bioactive agents modulate the
expression profiles, or expression profile nucleic acids or proteins provided herein. In a
particularly preferred embodiment, the candidate agent suppresses a colorectal cancer
phenotype, for example to a normal colon tissue fingerprint. Similarly, the candidate agent
preferably suppresses a severe colorectal cancer phenotype. Generally a plurality of assay 
mixtures are run in parallel with different agent concentrations to obtain a differential 
response to the various concentrations. Typically, one of these concentrations serves as a 
negative control, i.e., at zero concentration or below the level of detection.
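
By way of illustration only, a set of parallel assay mixtures with a zero-concentration negative control may be summarized as in the following sketch. The sketch assumes one measured signal per agent concentration and expresses each signal as a percentage of the negative control. The function name, the data layout and the readings are hypothetical assumptions.

```python
def dose_response(readouts):
    """Normalize assay signals to the zero-concentration negative control.

    `readouts` maps agent concentration to measured signal and must contain
    an entry for concentration 0.0, which serves as the negative control.
    Returns (concentration, percent of control) pairs in ascending order.
    """
    if 0.0 not in readouts:
        raise ValueError("a zero-concentration negative control is required")
    baseline = readouts[0.0]
    if baseline <= 0:
        raise ValueError("negative control signal must be positive")
    return [(conc, readouts[conc] * 100.0 / baseline) for conc in sorted(readouts)]


# Hypothetical signals at increasing agent concentrations (arbitrary units).
readings = {0.0: 200.0, 1.0: 150.0, 10.0: 50.0}
for concentration, percent in dose_response(readings):
    print(concentration, percent)
```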

[160] In one aspect, a candidate agent will neutralize the effect of a 
colorectal cancer protein. By "neutralize" is meant that activity of a protein is either 
inhibited or counteracted so as to have substantially no effect on a cell.

[161] Candidate agents encompass numerous chemical classes, though 
typically they are organic molecules, preferably small organic compounds having a molecular
weight of more than 100 and less than about 2,500 daltons. Preferred small molecules are 
less than 2000, or less than 1500 or less than 1000 or less than 500 D. Candidate agents 
comprise functional groups necessary for structural interaction with proteins, particularly 
hydrogen bonding, and typically include at least an amine, carbonyl, hydroxyl or carboxyl 
20 group, preferably at least two of the functional chemical groups. The candidate agents often 
comprise cyclical carbon or heterocyclic structures and/or aromatic or polyaromatic 
structures substituted with one or more of the above functional groups. Candidate agents are 
also found among biomolecules including peptides, saccharides, fatty acids, steroids, purines, 
pyrimidines, derivatives, structural analogs or combinations thereof. Particularly preferred 
25 are peptides. 

[162] Candidate agents are obtained from a wide variety of sources including
libraries of synthetic or natural compounds. For example, numerous means are available for
random and directed synthesis of a wide variety of organic compounds and biomolecules,
including expression of randomized oligonucleotides. Alternatively, libraries of natural
compounds in the form of bacterial, fungal, plant and animal extracts are available or readily
produced. Additionally, natural or synthetically produced libraries and compounds are
readily modified through conventional chemical, physical and biochemical means. Known
pharmacological agents may be subjected to directed or random chemical modifications, such
as acylation, alkylation, esterification, and amidification, to produce structural analogs.

[163] In a preferred embodiment, the candidate bioactive agents are
proteins. By "protein" herein is meant at least two covalently attached amino acids, which
includes proteins, polypeptides, oligopeptides and peptides. The protein may be made up of
naturally occurring amino acids and peptide bonds, or synthetic peptidomimetic structures.
Thus "amino acid", or "peptide residue", as used herein means both naturally occurring and
synthetic amino acids. For example, homo-phenylalanine, citrulline and norleucine are
considered amino acids for the purposes of the invention. "Amino acid" also includes imino
acid residues such as proline and hydroxyproline. The side chains may be in either the (R)
or the (S) configuration. In the preferred embodiment, the amino acids are in the (S) or
L-configuration. If non-naturally occurring side chains are used, non-amino acid substituents
may be used, for example to prevent or retard in vivo degradation.

[164] In a preferred embodiment, the candidate bioactive agents are naturally
occurring proteins or fragments of naturally occurring proteins. Thus, for example, cellular
extracts containing proteins, or random or directed digests of proteinaceous cellular extracts,
may be used. In this way libraries of procaryotic and eucaryotic proteins may be made for
screening in the methods of the invention. Particularly preferred in this embodiment are
libraries of bacterial, fungal, viral, and mammalian proteins, with the latter being preferred,
and human proteins being especially preferred.

[165] In a preferred embodiment, the candidate bioactive agents are peptides
of from about 5 to about 30 amino acids, with from about 5 to about 20 amino acids being
preferred, and from about 7 to about 15 being particularly preferred. The peptides may be
digests of naturally occurring proteins as is outlined above, random peptides, or "biased"
random peptides. By "randomized" or grammatical equivalents herein is meant that each
nucleic acid and peptide consists of essentially random nucleotides and amino acids,
respectively. Since generally these random peptides (or nucleic acids, discussed below) are
chemically synthesized, they may incorporate any nucleotide or amino acid at any position.
The synthetic process can be designed to generate randomized proteins or nucleic acids, to
allow the formation of all or most of the possible combinations over the length of the
sequence, thus forming a library of randomized candidate bioactive proteinaceous agents.

[166] In one embodiment, the library is fully randomized, with no sequence
preferences or constants at any position. In a preferred embodiment, the library is biased.
That is, some positions within the sequence are either held constant, or are selected from a
limited number of possibilities. For example, in a preferred embodiment, the nucleotides or
amino acid residues are randomized within a defined class, for example, of hydrophobic
amino acids, hydrophilic residues, sterically biased (either small or large) residues, towards
the creation of nucleic acid binding domains, the creation of cysteines for cross-linking,
prolines for SH-3 domains, serines, threonines, tyrosines or histidines for phosphorylation
sites, etc., or to purines, etc.
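
The distinction between fully randomized and biased positions can be illustrated with a short sketch. The template notation ('X', 'h', and fixed letters), the choice of hydrophobic residues, and the function names are hypothetical conventions introduced only for this example; they are not part of the specification.

```python
import random

ALL_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
HYDROPHOBIC = "AVLIMFW"                   # one possible "defined class"


def biased_peptide(template, rng):
    """Build one peptide from a template string.

    Template symbols (illustrative convention only):
      'X' -> fully randomized position (any standard amino acid)
      'h' -> position randomized within the hydrophobic class
      any other character -> residue held constant at that position
    """
    residues = []
    for symbol in template:
        if symbol == "X":
            residues.append(rng.choice(ALL_AMINO_ACIDS))
        elif symbol == "h":
            residues.append(rng.choice(HYDROPHOBIC))
        else:
            residues.append(symbol)
    return "".join(residues)


def make_library(template, size, seed=0):
    """Generate `size` peptides from the template with a seeded generator."""
    rng = random.Random(seed)
    return [biased_peptide(template, rng) for _ in range(size)]


# A 9-residue biased library: cysteine and proline held constant,
# two positions restricted to hydrophobic residues, the rest random.
library = make_library("XXChhXPXX", size=5)
```

A template consisting only of 'X' symbols would correspond to the fully randomized library, while fixed letters and class-restricted symbols implement the bias.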

[167] In a preferred embodiment, the candidate bioactive agents are nucleic
acids, as defined above.

[168] As described above generally for proteins, nucleic acid candidate
bioactive agents may be naturally occurring nucleic acids, random nucleic acids, or "biased"
random nucleic acids. For example, digests of procaryotic or eucaryotic genomes may be
used as is outlined above for proteins.

[169] In a preferred embodiment, the candidate bioactive agents are organic 
chemical moieties, a wide variety of which are available in the literature. 

[170] "Antibody" refers to a polypeptide comprising a framework region
from an immunoglobulin gene or fragments thereof that specifically binds and recognizes an
antigen. The recognized immunoglobulin genes include the kappa, lambda, alpha, gamma,
delta, epsilon, and mu constant region genes, as well as the myriad immunoglobulin variable
region genes. Light chains are classified as either kappa or lambda. Heavy chains are
classified as gamma, mu, alpha, delta, or epsilon, which in turn define the immunoglobulin
classes, IgG, IgM, IgA, IgD and IgE, respectively. Typically, the antigen-binding region of
an antibody will be most critical in specificity and affinity of binding.

[171] An exemplary immunoglobulin (antibody) structural unit comprises a
tetramer. Each tetramer is composed of two identical pairs of polypeptide chains, each pair
having one "light" (about 25 kD) and one "heavy" chain (about 50-70 kD). The N-terminus
of each chain defines a variable region of about 100 to 110 or more amino acids primarily
responsible for antigen recognition. The terms variable light chain (VL) and variable heavy
chain (VH) refer to these light and heavy chains respectively.

[172] Antibodies exist, e.g., as intact immunoglobulins or as a number of
well-characterized fragments produced by digestion with various peptidases. Thus, for
example, pepsin digests an antibody below the disulfide linkages in the hinge region to
produce F(ab)'2, a dimer of Fab which itself is a light chain joined to VH-CH1 by a disulfide
bond. The F(ab)'2 may be reduced under mild conditions to break the disulfide linkage in the
hinge region, thereby converting the F(ab)'2 dimer into an Fab' monomer. The Fab'
monomer is essentially Fab with part of the hinge region (see Fundamental Immunology
(Paul ed., 3d ed. 1993)). While various antibody fragments are defined in terms of the
digestion of an intact antibody, one of skill will appreciate that such fragments may be
synthesized de novo either chemically or by using recombinant DNA methodology. Thus,
the term antibody, as used herein, also includes antibody fragments either produced by the
modification of whole antibodies, or those synthesized de novo using recombinant DNA
methodologies (e.g., single chain Fv) or those identified using phage display libraries (see,
e.g., McCafferty et al., Nature 348:552-554 (1990)).

[173] For preparation of antibodies, e.g., recombinant, monoclonal, or
polyclonal antibodies, many techniques known in the art can be used (see, e.g., Kohler &
Milstein, Nature 256:495-497 (1975); Kozbor et al., Immunology Today 4:72 (1983); Cole et
al., pp. 77-96 in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc. (1985);
Coligan, Current Protocols in Immunology (1991); Harlow & Lane, Antibodies, A
Laboratory Manual (1988); and Goding, Monoclonal Antibodies: Principles and Practice (2d
ed. 1986)). The genes encoding the heavy and light chains of an antibody of interest can be
cloned from a cell, e.g., the genes encoding a monoclonal antibody can be cloned from a
hybridoma and used to produce a recombinant monoclonal antibody. Gene libraries encoding
heavy and light chains of monoclonal antibodies can also be made from hybridoma or
plasma cells. Random combinations of the heavy and light chain gene products generate a
large pool of antibodies with different antigenic specificity (see, e.g., Kuby, Immunology (3rd
ed. 1997)). Techniques for the production of single chain antibodies or recombinant
antibodies (U.S. Patent 4,946,778, U.S. Patent No. 4,816,567) can be adapted to produce
antibodies to polypeptides of this invention. Also, transgenic mice, or other organisms such
as other mammals, may be used to express humanized or human antibodies (see, e.g., U.S.
Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126; 5,633,425; 5,661,016; Marks et al.,
Bio/Technology 10:779-783 (1992); Lonberg et al., Nature 368:856-859 (1994); Morrison,
Nature 368:812-13 (1994); Fishwild et al., Nature Biotechnology 14:845-51 (1996);
Neuberger, Nature Biotechnology 14:826 (1996); and Lonberg & Huszar, Intern. Rev.
Immunol. 13:65-93 (1995)). Alternatively, phage display technology can be used to identify
antibodies and heteromeric Fab fragments that specifically bind to selected antigens (see, e.g.,
McCafferty et al., Nature 348:552-554 (1990); Marks et al., Biotechnology 10:779-783
(1992)). Antibodies can also be made bispecific, i.e., able to recognize two different antigens
(see, e.g., WO 93/08829; Traunecker et al., EMBO J. 10:3655-3659 (1991); and Suresh et al.,
Methods in Enzymology 121:210 (1986)). Antibodies can also be heteroconjugates, e.g., two
covalently joined antibodies, or immunotoxins (see, e.g., U.S. Patent No. 4,676,980; WO
91/00360; WO 92/200373; and EP 03089).

[174] Methods for humanizing or primatizing non-human antibodies are well
known in the art. Generally, a humanized antibody has one or more amino acid residues
introduced into it from a source which is non-human. These non-human amino acid residues
are often referred to as import residues, which are typically taken from an import variable
domain. Humanization can be essentially performed following the method of Winter and
co-workers (see, e.g., Jones et al., Nature 321:522-525 (1986); Riechmann et al., Nature
332:323-327 (1988); Verhoeyen et al., Science 239:1534-1536 (1988) and Presta, Curr. Op.
Struct. Biol. 2:593-596 (1992)), by substituting rodent CDRs or CDR sequences for the
corresponding sequences of a human antibody. Accordingly, such humanized antibodies are
chimeric antibodies (U.S. Patent No. 4,816,567), wherein substantially less than an intact
human variable domain has been substituted by the corresponding sequence from a
non-human species. In practice, humanized antibodies are typically human antibodies in which
some CDR residues and possibly some FR residues are substituted by residues from
analogous sites in rodent antibodies.

[175] A "chimeric antibody" is an antibody molecule in which (a) the
constant region, or a portion thereof, is altered, replaced or exchanged so that the antigen
binding site (variable region) is linked to a constant region of a different or altered class,
effector function and/or species, or an entirely different molecule which confers new
properties to the chimeric antibody, e.g., an enzyme, toxin, hormone, growth factor, drug,
etc.; or (b) the variable region, or a portion thereof, is altered, replaced or exchanged with a
variable region having a different or altered antigen specificity.

[176] In one embodiment, the antibody is conjugated to an "effector" moiety.
The effector moiety can be any number of molecules, including labeling moieties such as
radioactive labels or fluorescent labels, or can be a therapeutic moiety. In one aspect the
antibody modulates the activity of the protein.

[177] After the candidate agent has been added and the cells allowed to
incubate for some period of time, the sample containing the target sequences to be analyzed is
added to the biochip. If required, the target sequence is prepared using known techniques.
For example, the sample may be treated to lyse the cells, using known lysis buffers,
electroporation, etc., with purification and/or amplification such as PCR occurring as needed,
as will be appreciated by those in the art. For example, an in vitro transcription with labels
covalently attached to the nucleosides is done. Generally, the nucleic acids are labeled with
biotin-FITC or PE, or with Cy3 or Cy5.

[178] In a preferred embodiment, the target sequence is labeled with, for
example, a fluorescent, a chemiluminescent, a chemical, or a radioactive signal, to provide a
means of detecting the target sequence's specific binding to a probe. The label also can be an
enzyme, such as alkaline phosphatase or horseradish peroxidase, which when provided with
an appropriate substrate produces a product that can be detected. Alternatively, the label can
be a labeled compound or small molecule, such as an enzyme inhibitor, that binds but is not
catalyzed or altered by the enzyme. The label also can be a moiety or compound, such as an
epitope tag or biotin which specifically binds to streptavidin. For the example of biotin, the
streptavidin is labeled as described above, thereby providing a detectable signal for the
bound target sequence. As known in the art, unbound labeled streptavidin is removed prior to
analysis.

[179] As will be appreciated by those in the art, these assays can be direct
hybridization assays or can comprise "sandwich assays", which include the use of multiple
probes, as is generally outlined in U.S. Patent Nos. 5,681,702, 5,597,909, 5,545,730,
5,594,117, 5,591,584, 5,571,670, 5,580,731, 5,571,670, 5,591,584, 5,624,802, 5,635,352,
5,594,118, 5,359,100, 5,124,246 and 5,681,697, all of which are hereby incorporated by
reference. In this embodiment, in general, the target nucleic acid is prepared as outlined
above, and then added to the biochip comprising a plurality of nucleic acid probes, under
conditions that allow the formation of a hybridization complex.

[180] A variety of hybridization conditions may be used in the present
invention, including high, moderate and low stringency conditions as outlined above. The
assays are generally run under stringency conditions which allow formation of the label
probe hybridization complex only in the presence of target. Stringency can be controlled by
altering a step parameter that is a thermodynamic variable, including, but not limited to,
temperature, formamide concentration, salt concentration, chaotropic salt concentration, pH,
organic solvent concentration, etc.

[181] These parameters may also be used to control non-specific binding, as 
is generally outlined in U.S. Patent No. 5,681,697. Thus it may be desirable to perform 
certain steps at higher stringency conditions to reduce non-specific binding. 

[182] The reactions outlined herein may be accomplished in a variety of
ways, as will be appreciated by those in the art. Components of the reaction may be added
simultaneously, or sequentially, in any order, with preferred embodiments outlined below. In
addition, a variety of other reagents may be included in the assays.
These include reagents like salts, buffers, neutral proteins, e.g., albumin, detergents, etc., which
may be used to facilitate optimal hybridization and detection, and/or reduce non-specific or
background interactions. Also reagents that otherwise improve the efficiency of the assay,
such as protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may be used,
depending on the sample preparation methods and purity of the target.

[183] Once the assay is run, the data is analyzed to determine the expression
levels, and changes in expression levels as between states, of individual genes, forming a
gene expression profile.
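
One way such an analysis could be carried out is sketched below. The log2 ratio, the two-fold cutoff, the gene names, and the signal values are assumptions chosen for illustration; the specification does not prescribe a particular metric or threshold.

```python
import math


def expression_profile(state_a, state_b, min_fold=2.0):
    """Return {gene: log2(state_b / state_a)} for genes changing by >= min_fold.

    state_a, state_b: dicts mapping gene name -> expression level in each state.
    min_fold: hypothetical cutoff for calling a gene differentially expressed.
    """
    threshold = math.log2(min_fold)
    profile = {}
    for gene in state_a.keys() & state_b.keys():
        a, b = state_a[gene], state_b[gene]
        if a <= 0 or b <= 0:
            continue  # a ratio is undefined without signal in both states
        log_ratio = math.log2(b / a)
        if abs(log_ratio) >= threshold:
            profile[gene] = log_ratio
    return profile


# Hypothetical expression levels (arbitrary units) in two states.
normal = {"geneA": 100.0, "geneB": 50.0, "geneC": 80.0}
tumor = {"geneA": 400.0, "geneB": 55.0, "geneC": 20.0}
profile = expression_profile(normal, tumor)
```

Positive values mark genes with higher expression in the second state and negative values mark genes with lower expression; genes detected in only one state are skipped by this sketch and would need separate handling.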

[184] The screens are done to identify drugs or bioactive agents that
modulate the colorectal cancer phenotype. Specifically, there are several types of screens
that can be run. A preferred embodiment is in the screening of candidate agents that can
induce or suppress a particular expression profile, thus preferably generating the associated
phenotype. That is, candidate agents that can mimic or produce an expression profile in
colorectal cancer similar to the expression profile of normal colon tissue are expected to result
in a suppression of the colorectal cancer phenotype. Thus, in this embodiment, mimicking an
expression profile, or changing one profile to another, is the goal.

[185] In a preferred embodiment, as for the diagnosis and prognosis
applications, having identified the differentially expressed genes important in any one state,
screens can be run to alter the expression of the genes individually. That is, rather than
trying to mimic all or part of an expression profile, screening can be done for modulation
of the expression of individual genes.
Thus, for example, particularly in the case of target genes whose presence or absence
is unique between two states, screening is done for modulators of the target gene expression.

[186] In a preferred embodiment, screening is done to alter the biological
function of the expression product of the differentially expressed gene. Again, having
identified the importance of a gene in a particular state, screening for agents that bind and/or
modulate the biological activity of the gene product can be run as is more fully outlined
below.

[187] Thus, screening of candidate agents that modulate the colorectal cancer 
phenotype either at the gene expression level or the protein level can be done. 

[188] In addition, screens can be done for novel genes that are induced in
response to a candidate agent. After identifying a candidate agent based upon its ability to
suppress a colorectal cancer expression pattern leading to a normal expression pattern, or
modulate a single colorectal cancer gene expression profile so as to mimic the expression of
the gene from normal tissue, a screen as described above can be performed to identify genes
that are specifically modulated in response to the agent. Comparing expression profiles
between normal tissue and agent treated colorectal cancer tissue reveals genes that are not
expressed in normal tissue or colorectal cancer tissue, but are expressed in agent treated
tissue. These agent specific sequences can be identified and used by any of the methods
described herein for colorectal cancer genes or proteins. In particular these sequences and
the proteins they encode find use in marking or identifying agent treated cells. In addition,
antibodies can be raised against the agent induced proteins and used to target novel
therapeutics to the treated colorectal cancer tissue sample.
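
The comparison that isolates agent specific sequences amounts to a set difference, which the following sketch illustrates. The detection limit, gene names, and expression values are hypothetical and serve only to show the logic.

```python
def agent_specific_genes(normal, cancer, treated, detection_limit=1.0):
    """Genes detected in agent treated tissue but in neither untreated state.

    Each argument maps gene name -> expression level; a gene counts as
    expressed when its level exceeds detection_limit (an assumed cutoff).
    """
    def expressed(profile):
        return {gene for gene, level in profile.items() if level > detection_limit}

    return expressed(treated) - expressed(normal) - expressed(cancer)


# Hypothetical expression levels (arbitrary units).
normal_tissue = {"g1": 10.0, "g2": 0.2, "g3": 0.1}
cancer_tissue = {"g1": 0.5, "g2": 8.0, "g3": 0.3}
treated_tissue = {"g1": 9.0, "g2": 1.5, "g3": 6.0}

induced = agent_specific_genes(normal_tissue, cancer_tissue, treated_tissue)
```

In this example only the gene absent from both the normal and the untreated cancer profiles is reported, which corresponds to a sequence that could mark agent treated cells.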

[189] Thus, in one embodiment, a candidate agent is administered to a
population of colorectal cancer cells, which thus has an associated colorectal cancer
expression profile. By "administration" or "contacting" herein is meant that the candidate
agent is added to the cells in such a manner as to allow the agent to act upon the cell, whether
by uptake and intracellular action, or by action at the cell surface. In some embodiments,
nucleic acid encoding a proteinaceous candidate agent (i.e., a peptide) may be put into a viral
construct such as a retroviral construct and added to the cell, such that expression of the
peptide agent is accomplished; see PCT US97/01019, hereby expressly incorporated by
reference.

[190] Once the candidate agent has been administered to the cells, the cells
can be washed if desired and are allowed to incubate under preferably physiological
conditions for some period of time. The cells are then harvested and a new gene expression
profile is generated, as outlined herein.

[191] Thus, for example, colorectal cancer tissue may be screened for
agents that reduce or suppress the colorectal cancer phenotype. A change in at least one
gene of the expression profile indicates that the agent has an effect on colorectal cancer
activity. By defining such a signature for the colorectal cancer phenotype, screens for new
drugs that alter the phenotype can be devised. With this approach, the drug target need not be
known and need not be represented in the original expression screening platform, nor does
the level of transcript for the target protein need to change.

[192] In a preferred embodiment, as outlined above, screens may be done on
individual genes and gene products (proteins). That is, having identified a particular
differentially expressed gene as important in a particular state, screening of modulators of
either the expression of the gene or the gene product itself can be done. The gene products of
differentially expressed genes are sometimes referred to herein as "colorectal cancer
modulator proteins". The colorectal cancer modulator protein may be a fragment or,
alternatively, the full length protein corresponding to a fragment shown herein. Preferably, the colorectal
cancer modulator protein is a fragment approximately 14 to 24 amino acids long. More
preferably the fragment is a soluble fragment.

[193] In a preferred embodiment, the fragment is charged and from the
C-terminus. In one embodiment, the C-terminus of the fragment is kept as a free acid and the
N-terminus is a free amine to aid in coupling, i.e., to cysteine. In another embodiment, the
fragment is an internal peptide overlapping a hydrophilic stretch of the protein. In a preferred
embodiment, the termini are blocked. In another preferred embodiment, the fragment is a
novel fragment from the N-terminal. In one embodiment, the fragment excludes sequence
outside of the N-terminal; in another embodiment, the fragment includes at least a portion of
the N-terminal. "N-terminal" is used interchangeably herein with "N-terminus" which is
further described above.

[194] In one embodiment the colorectal cancer proteins are conjugated to an
immunogenic agent as discussed herein. In one embodiment the colorectal cancer protein is
conjugated to BSA.

[195] Thus, in a preferred embodiment, screening for modulators of
expression of specific genes can be done. This will be done as outlined above, but in general
the expression of only one or a few genes is evaluated.

[196] In a preferred embodiment, screens are designed to first find candidate
agents that can bind to differentially expressed proteins, and then these agents may be used in
assays that evaluate the ability of the candidate agent to modulate the activity of the
differentially expressed proteins. Thus, as will be appreciated by those in the art, there are a number of different
assays which may be run: binding assays and activity assays.

[197] In a preferred embodiment, binding assays are done. In general,
purified or isolated gene product is used; that is, the gene products of one or more
differentially expressed nucleic acids are made. In general, this is done as is known in the art.
For example, antibodies are generated to the protein gene products, and standard
immunoassays are run to determine the amount of protein present. Alternatively, cells
comprising the colorectal cancer proteins can be used in the assays.

[198] Thus, in a preferred embodiment, the methods comprise combining a
colorectal cancer protein and a candidate bioactive agent, and determining the binding of the
candidate agent to the colorectal cancer protein. Preferred embodiments utilize the human
colorectal cancer protein, although other mammalian proteins may also be used, for example
for the development of animal models of human disease. In some embodiments, as outlined
herein, variant or derivative colorectal cancer proteins may be used.

[199] Generally, in a preferred embodiment of the methods herein, the
colorectal cancer protein or the candidate agent is non-diffusably bound to an insoluble
support having isolated sample receiving areas (e.g., a microtiter plate, an array, etc.). The
insoluble supports may be made of any composition to which the compositions can be bound,
that is readily separated from soluble material, and that is otherwise compatible with the overall
method of screening. The surface of such supports may be solid or porous and of any
convenient shape. Examples of suitable insoluble supports include microtiter plates, arrays,
membranes and beads. These are typically made of glass, plastic (e.g., polystyrene),
polysaccharides, nylon or nitrocellulose, Teflon, etc. Microtiter plates and arrays are
especially convenient because a large number of assays can be carried out simultaneously,
using small amounts of reagents and samples. The particular manner of binding of the
composition is not crucial so long as it is compatible with the reagents and overall methods of
the invention, maintains the activity of the composition and is nondiffusable. Preferred
methods of binding include the use of antibodies (which do not sterically block either the
ligand binding site or activation sequence when the protein is bound to the support), direct
binding to "sticky" or ionic supports, chemical crosslinking, the synthesis of the protein or
agent on the surface, etc. Following binding of the protein or agent, excess unbound material
is removed by washing. The sample receiving areas may then be blocked through incubation
with bovine serum albumin (BSA), casein or other innocuous protein or other moiety.

[200] In a preferred embodiment, the colorectal cancer protein is bound to
the support, and a candidate bioactive agent is added to the assay. Alternatively, the
candidate agent is bound to the support and the colorectal cancer protein is added. Novel
binding agents include specific antibodies, non-natural binding agents identified in screens of
chemical libraries, peptide analogs, etc. Of particular interest are screening assays for agents
that have a low toxicity for human cells. A wide variety of assays may be used for this
purpose, including labeled in vitro protein-protein binding assays, electrophoretic mobility
shift assays, immunoassays for protein binding, functional assays (phosphorylation assays,
etc.) and the like.

[201] The determination of the binding of the candidate bioactive agent to
the colorectal cancer protein may be done in a number of ways. In a preferred embodiment,
the candidate bioactive agent is labeled, and binding determined directly. For example, this
may be done by attaching all or a portion of the colorectal cancer protein to a solid support,
adding a labeled candidate agent (for example a fluorescent label), washing off excess
reagent, and determining whether the label is present on the solid support. Various blocking
and washing steps may be utilized as is known in the art.

[202] By "labeled" herein is meant that the compound is either directly or
indirectly labeled with a label which provides a detectable signal, e.g., radioisotope,
fluorescers, enzyme, antibodies, particles such as magnetic particles, chemiluminescers, or
specific binding molecules, etc. Specific binding molecules include pairs, such as biotin and
streptavidin, digoxin and antidigoxin, etc. For the specific binding members, the
complementary member would normally be labeled with a molecule which provides for
detection, in accordance with known procedures, as outlined above. The label can directly or
indirectly provide a detectable signal.

[203] In some embodiments, only one of the components is labeled. For
example, the proteins (or proteinaceous candidate agents) may be labeled at tyrosine
positions using 125I, or with fluorophores. Alternatively, more than one component may be
labeled with different labels; using 125I for the proteins, for example, and a fluorophore for the
candidate agents.

[204] In a preferred embodiment, the binding of the candidate bioactive
agent is determined through the use of competitive binding assays. In this embodiment, the
competitor is a binding moiety known to bind to the target molecule (i.e., a colorectal cancer
protein), such as an antibody, peptide, binding partner, ligand, etc. Under certain circumstances, there
may be competitive binding as between the bioactive agent and the binding moiety, with the
binding moiety displacing the bioactive agent.

[205] In one embodiment, the candidate bioactive agent is labeled. Either
the candidate bioactive agent, or the competitor, or both, is added first to the protein for a
time sufficient to allow binding, if present. Incubations may be performed at any
temperature which facilitates optimal activity, typically between 4 and 40°C. Incubation
periods are selected for optimum activity, but may also be optimized to facilitate rapid high
throughput screening. Typically between 0.1 and 1 hour will be sufficient. Excess reagent is
generally removed or washed away. The second component is then added, and the presence
or absence of the labeled component is followed, to indicate binding.

[206] In a preferred embodiment, the competitor is added first, followed by
the candidate bioactive agent. Displacement of the competitor is an indication that the
candidate bioactive agent is binding to the colorectal cancer protein and thus is capable of
binding to, and potentially modulating, the activity of the colorectal cancer protein. In this
embodiment, either component can be labeled. Thus, for example, if the competitor is
labeled, the presence of label in the wash solution indicates displacement by the agent.
Alternatively, if the candidate bioactive agent is labeled, the presence of the label on the
support indicates displacement.
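
The readout logic of the competitor-first format can be summarized in a few lines of code. This is a sketch of the interpretation rule only; the function name and the boolean inputs are hypothetical simplifications of what would in practice be thresholded signal measurements.

```python
def displacement_detected(labeled_component, label_in_wash, label_on_support):
    """Interpret a competitor-first displacement assay.

    labeled_component: "competitor" or "candidate" (which species carries the label).
    label_in_wash: True if label is detected in the wash solution.
    label_on_support: True if label is detected on the solid support.
    """
    if labeled_component == "competitor":
        # Labeled competitor released into the wash means the agent displaced it.
        return label_in_wash
    if labeled_component == "candidate":
        # Labeled agent retained on the support means it took the competitor's place.
        return label_on_support
    raise ValueError("labeled_component must be 'competitor' or 'candidate'")
```

For instance, displacement_detected("competitor", True, False) evaluates to True, matching the case in which label appears in the wash solution.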

[207] In an alternative embodiment, the candidate bioactive agent is added
first, with incubation and washing, followed by the competitor. The absence of binding by
the competitor may indicate that the bioactive agent is bound to the colorectal cancer protein
with a higher affinity. Thus, if the candidate bioactive agent is labeled, the presence of the
label on the support, coupled with a lack of competitor binding, may indicate that the
candidate agent is capable of binding to the colorectal cancer protein.

[208] In a preferred embodiment, the methods comprise differential
screening to identify bioactive agents that are capable of modulating the activity of the
colorectal cancer proteins. In this embodiment, the methods comprise combining a
colorectal cancer protein and a competitor in a first sample. A second sample comprises a
candidate bioactive agent, a colorectal cancer protein and a competitor. The binding of the
competitor is determined for both samples, and a change, or difference in binding between
the two samples indicates the presence of an agent capable of binding to the colorectal
cancer protein and potentially modulating its activity. That is, if the binding of the
competitor is different in the second sample relative to the first sample, the agent is capable
of binding to the colorectal cancer protein.
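
The two-sample comparison described above can be sketched as a small calculation. This is only an illustration: the function name and the replicate readings are hypothetical, and the text does not specify how large a difference should count as meaningful.

```python
from statistics import mean

def competitor_binding_change(first_sample, second_sample):
    """Fractional change in mean competitor binding, second sample vs. first."""
    first_mean = mean(first_sample)
    second_mean = mean(second_sample)
    return (second_mean - first_mean) / first_mean

# Hypothetical replicate readings of bound, labeled competitor.
first = [1000.0, 1040.0, 960.0]   # protein + competitor
second = [480.0, 520.0, 500.0]    # protein + competitor + candidate agent

change = competitor_binding_change(first, second)
print(f"change in competitor binding: {change:+.0%}")
```

A negative value means less competitor is bound when the candidate agent is present, which is the kind of difference the paragraph treats as evidence of binding.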

[209] Alternatively, a preferred embodiment utilizes differential screening to 
identify drug candidates that bind to the native colorectal cancer protein, but cannot bind to 
modified colorectal cancer proteins. The structure of the colorectal cancer protein may be 
modeled, and used in rational drug design to synthesize agents that interact with that site.
Drug candidates that affect colorectal cancer bioactivity are also identified by screening
drugs for the ability to either enhance or reduce the activity of the protein. 

[210] Positive controls and negative controls may be used in the assays. 
Preferably all control and test samples are performed in at least triplicate to obtain 
statistically significant results. Incubation of all samples is for a time sufficient for the
binding of the agent to the protein. Following incubation, all samples are washed free of
non-specifically bound material and the amount of bound, generally labeled, agent is determined.
For example, where a radiolabel is employed, the samples may be counted in a scintillation 
counter to determine the amount of bound compound. 
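
As a rough illustration of how triplicate control and test counts might be summarized, the sketch below subtracts the negative-control background and expresses the test signal as a percentage of the positive control. The normalization scheme, function names and counts are assumptions made for the example; the text only calls for controls and at least triplicate samples.

```python
from statistics import mean, stdev

def summarize(counts):
    """Mean and sample standard deviation; requires at least triplicate data."""
    if len(counts) < 3:
        raise ValueError("at least three replicates are expected")
    return mean(counts), stdev(counts)

def percent_of_positive(test, positive, negative):
    """Background-subtracted test signal as a percentage of the positive control."""
    background = mean(negative)
    return 100.0 * (mean(test) - background) / (mean(positive) - background)

# Hypothetical scintillation counts (cpm).
negative = [100.0, 110.0, 90.0]
positive = [1100.0, 1050.0, 1150.0]
test = [600.0, 590.0, 610.0]

test_mean, test_sd = summarize(test)
relative_binding = percent_of_positive(test, positive, negative)
print(test_mean, test_sd, relative_binding)
```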

[211] A variety of other reagents may be included in the screening assays.
These include reagents like salts, neutral proteins, e.g. albumin, detergents, etc., which may be
used to facilitate optimal protein-protein binding and/or reduce non-specific or background
interactions. Also reagents that otherwise improve the efficiency of the assay, such as
protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may be used. The mixture
of components may be added in any order that provides for the requisite binding. 

[212] Screening for agents that modulate the activity of colorectal cancer 
proteins may also be done. In a preferred embodiment, methods for screening for a bioactive 
agent capable of modulating the activity of colorectal cancer proteins comprise the steps of
adding a candidate bioactive agent to a sample of colorectal cancer proteins, as above, and
determining an alteration in the biological activity of colorectal cancer proteins.
"Modulating the activity of colorectal cancer" includes an increase in activity, a decrease in
activity, or a change in the type or kind of activity present. Thus, in this embodiment, the
candidate agent should both bind to colorectal cancer proteins (although this may not be
necessary), and alter their biological or biochemical activity as defined herein. The methods
include both in vitro screening methods, as are generally outlined above, and in vivo 
screening of cells for alterations in the presence, distribution, activity or amount of colorectal 
cancer proteins. 

[213] Thus, in this embodiment, the methods comprise combining a
colorectal cancer sample and a candidate bioactive agent, and evaluating the effect on
colorectal cancer activity. By "colorectal cancer activity" or grammatical equivalents herein
is meant one of the colorectal cancer's biological activities, including, but not limited to, cell
division, preferably in colon tissue, cell proliferation, tumor growth, and transformation of cells.
In one embodiment, colorectal cancer activity includes activation of a gene identified by a
nucleic acid of Table 1. An inhibitor of colorectal cancer activity is the inhibition of any one
or more colorectal cancer activities.

[214] In a preferred embodiment, the activity of the colorectal cancer protein 
is increased; in another preferred embodiment, the activity of the colorectal cancer protein is 
decreased. Thus, bioactive agents that are antagonists are preferred in some embodiments,
and bioactive agents that are agonists may be preferred in other embodiments.

[215] In a preferred embodiment, the invention provides methods for 
screening for bioactive agents capable of modulating the activity of a colorectal cancer 
protein. The methods comprise adding a candidate bioactive agent, as defined above, to a 
cell comprising colorectal cancer proteins. Preferred cell types include almost any cell. The 
cells contain a recombinant nucleic acid that encodes a colorectal cancer protein. In a
preferred embodiment, a library of candidate agents is tested on a plurality of cells.

[216] In one aspect, the assays are evaluated in the presence or absence of, or with
previous or subsequent exposure to, physiological signals, for example hormones, antibodies,
peptides, antigens, cytokines, growth factors, action potentials, pharmacological agents
including chemotherapeutics, radiation, carcinogens, or other cells (i.e. cell-cell contacts).
In another example, the determinations are made at different stages of the cell cycle
process.

[217] In this way, bioactive agents are identified. Compounds with
pharmacological activity are able to enhance or interfere with the activity of the colorectal
cancer protein. In one embodiment, "colorectal cancer protein activity" as used herein
includes at least one of the following: colorectal cancer activity, binding to the colorectal
cancer protein, activation of the colorectal cancer protein or activation of substrates of the
colorectal cancer protein by the colorectal cancer protein. In one embodiment, colorectal
cancer activity is defined as the unregulated proliferation of colon tissue, or the growth of
cancer in colon tissue. In one aspect, colorectal cancer activity as defined herein is related to
the activity of the colorectal cancer protein in the upregulation of the colorectal cancer
protein in colon cancer tissue.

[218] In another embodiment, colorectal cancer protein activity includes at
least one of the following: colorectal cancer activity, binding to the CBF9 nucleic acid or
polypeptide of Table 2 or binding to a nucleic acid of Table 1, or a peptide encoded by a
nucleic acid of Table 1 or activation of substrates of the gene products identified by a nucleic
acid of Table 1 or substrates of CBF9, which is shown in Table 2. In one aspect, colorectal
cancer activity as defined herein is related to the activity of genes defined by the nucleic acids
of Table 1 or of CBF9 as defined in Table 2, in colon cancer tissue.

[219] In one embodiment, a method of inhibiting colon cancer cell division is 
provided. The method comprises administration of a colorectal cancer inhibitor. 

[220] In another embodiment, a method of inhibiting tumor growth is 
provided. The method comprises administration of a colorectal cancer inhibitor. 

[221] In a further embodiment, methods of treating cells or individuals with
cancer are provided. The method comprises administration of a colorectal cancer inhibitor.

[222] In one embodiment, a colorectal cancer inhibitor is an antibody as 
discussed above. In another embodiment, the colorectal cancer inhibitor is an antisense 
molecule. Antisense molecules as used herein include antisense or sense oligonucleotides 
comprising a single-stranded nucleic acid sequence (either RNA or DNA) capable of binding
to target mRNA (sense) or DNA (antisense) sequences for colorectal cancer molecules. A
preferred antisense molecule is for the colorectal cancer sequences referenced in Table 1 or
Table 2, or for a ligand or activator thereof. Antisense or sense oligonucleotides, according
to the present invention, comprise a fragment generally at least about 14 nucleotides,
preferably from about 14 to 30 nucleotides. The ability to derive an antisense or a sense 
oligonucleotide, based upon a cDNA sequence encoding a given protein is described in, for 
example, Stein and Cohen (Cancer Res. 48:2659, 1988) and van der Krol et al. 
(BioTechniques 6:958, 1988). 
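
To make the derivation of an oligonucleotide from a cDNA sequence concrete, the following sketch takes a window of the coding-strand cDNA and returns its reverse complement, which is complementary to the corresponding mRNA. It is a minimal illustration only: the function name and the example sequence are hypothetical, the length check mirrors the "about 14 to 30 nucleotides" preference stated above, and real design would also weigh factors the text does not discuss (target accessibility, specificity, chemistry).

```python
# DNA complement table for upper-case sequences.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def antisense_oligo(cdna, start, length=20):
    """Reverse complement of cdna[start:start+length] (0-based start), as DNA."""
    if not 14 <= length <= 30:
        raise ValueError("length is expected to be about 14 to 30 nucleotides")
    target = cdna[start:start + length].upper()
    if len(target) < length:
        raise ValueError("window runs past the end of the sequence")
    return target.translate(COMPLEMENT)[::-1]

# Hypothetical cDNA fragment, not a sequence from Table 1 or Table 2.
example_cdna = "ATGGCCAAGCTTGGATCCGAATTC"
oligo = antisense_oligo(example_cdna, 0, 14)
print(oligo)
```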

[223] Antisense molecules may be introduced into a cell containing the target
nucleotide sequence by formation of a conjugate with a ligand binding molecule, as described
in WO 91/04753. Suitable ligand binding molecules include, but are not limited to, cell
surface receptors, growth factors, other cytokines, or other ligands that bind to cell surface
receptors. Preferably, conjugation of the ligand binding molecule does not substantially
interfere with the ability of the ligand binding molecule to bind to its corresponding molecule
or receptor, or block entry of the sense or antisense oligonucleotide or its conjugated version
into the cell. Alternatively, a sense or an antisense oligonucleotide may be introduced into a
cell containing the target nucleic acid sequence by formation of an oligonucleotide-lipid
complex, as described in WO 90/10448. It is understood that antisense molecules
or knock out and knock in models may also be used in screening assays as discussed above,
in addition to methods of treatment.

[224] The compounds having the desired pharmacological activity may be 
administered in a physiologically acceptable carrier to a host, as previously described. The 
agents may be administered in a variety of ways, orally, parenterally e.g., subcutaneously,
intraperitoneally, intravascularly, etc. Depending upon the manner of introduction, the
compounds may be formulated in a variety of ways. The concentration of therapeutically
active compound in the formulation may vary from about 0.1-100 wt.%. The agents may be
administered alone or in combination with other treatments, i.e., radiation.

[225] The pharmaceutical compositions can be prepared in various forms,
such as granules, tablets, pills, suppositories, capsules, suspensions, salves, lotions and the
like. Pharmaceutical grade organic or inorganic carriers and/or diluents suitable for oral and
topical use can be used to make up compositions containing the therapeutically-active
compounds. Diluents known to the art include aqueous media, vegetable and animal oils and
fats. Stabilizing agents, wetting and emulsifying agents, salts for varying the osmotic
pressure or buffers for securing an adequate pH value, and skin penetration enhancers can be
used as auxiliary agents.

[226] Without being bound by theory, it appears that the various colorectal
cancer sequences are important in colorectal cancer. Accordingly, disorders based on
mutant or variant colorectal cancer genes may be determined. In one embodiment, the
invention provides methods for identifying cells containing variant colorectal cancer genes
comprising determining all or part of the sequence of at least one endogenous colorectal
cancer gene in a cell. As will be appreciated by those in the art, this may be done using any
number of sequencing techniques. In a preferred embodiment, the invention provides
methods of identifying the colorectal cancer genotype of an individual comprising
determining all or part of the sequence of at least one colorectal cancer gene of the
individual. This is generally done in at least one tissue of the individual, and may include the
evaluation of a number of tissues or different samples of the same tissue. The method may
include comparing the sequence of the sequenced colorectal cancer gene to a known
colorectal cancer gene, i.e. a wild-type gene.

[227] The sequence of all or part of the colorectal cancer gene can then be 
compared to the sequence of a known colorectal cancer gene to determine if any differences 
exist. This can be done using any number of known homology programs, such as Bestfit, etc. 
In a preferred embodiment, the presence of a difference in the sequence between the
colorectal cancer gene of the patient and the known colorectal cancer gene is indicative of a
disease state or a propensity for a disease state, as outlined herein.
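
The comparison step can be illustrated with a deliberately simple sketch that reports position-by-position differences between two sequences that are already aligned and of equal length. It is not a substitute for a homology program such as Bestfit, which also handles insertions and deletions; the function name and the example sequences are hypothetical.

```python
def sequence_differences(known, sample):
    """List (1-based position, known base, sample base) for each mismatch.

    Assumes the two sequences are pre-aligned and of equal length.
    """
    if len(known) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (position, k, s)
        for position, (k, s) in enumerate(zip(known.upper(), sample.upper()), start=1)
        if k != s
    ]

# Hypothetical wild-type and patient fragments.
known_gene = "ATGGCCAAG"
patient_gene = "ATGGTCAAG"
differences = sequence_differences(known_gene, patient_gene)
print(differences)
```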

[228] In a preferred embodiment, the colorectal cancer genes are used as 
probes to determine the number of copies of the colorectal cancer gene in the genome. 

[229] In another preferred embodiment colorectal cancer genes are used as
probes to determine the chromosomal localization of the colorectal cancer genes.
Information such as chromosomal localization finds use in providing a diagnosis or prognosis
in particular when chromosomal abnormalities such as translocations, and the like are
identified in colorectal cancer gene loci.

[230] Thus, in one embodiment, methods of modulating colorectal cancer in
cells or organisms are provided. In one embodiment, the methods comprise administering to
a cell an anti-colorectal cancer antibody that reduces or eliminates the biological activity of
an endogenous colorectal cancer protein. Alternatively, the methods comprise
administering to a cell or organism a recombinant nucleic acid encoding a colorectal cancer
protein. As will be appreciated by those in the art, this may be accomplished in any number
of ways. In a preferred embodiment, for example when the colorectal cancer sequence is
down-regulated in colorectal cancer, the activity of the colorectal cancer gene is increased
by increasing the amount of colorectal cancer in the cell, for example by overexpressing the
endogenous colorectal cancer or by administering a gene encoding the colorectal cancer
sequence, using known gene-therapy techniques, for example. In a preferred embodiment,
the gene therapy techniques include the incorporation of the exogenous gene using enhanced
homologous recombination (EHR), for example as described in PCT/US93/03868, hereby
incorporated by reference in its entirety. Alternatively, for example when the colorectal
cancer sequence is up-regulated in colorectal cancer, the activity of the endogenous
colorectal cancer gene is decreased, for example by the administration of a colorectal cancer
antisense nucleic acid.

[231] In one embodiment, the colorectal cancer proteins of the present 
invention may be used to generate polyclonal and monoclonal antibodies to colorectal cancer 
proteins, which are useful as described herein. Similarly, the colorectal cancer proteins can
be coupled, using standard technology, to affinity chromatography columns. These columns
may then be used to purify colorectal cancer antibodies. In a preferred embodiment, the 
antibodies are generated to epitopes unique to a colorectal cancer protein; that is, the 
antibodies show little or no cross-reactivity to other proteins. These antibodies find use in a 
number of applications. For example, the colorectal cancer antibodies may be coupled to
standard affinity chromatography columns and used to purify colorectal cancer proteins. The
antibodies may also be used as blocking polypeptides, as outlined above, since they will 
specifically bind to the colorectal cancer protein. 

[232] In one embodiment, a therapeutically effective dose of a colorectal
cancer or modulator thereof is administered to a patient. By "therapeutically effective dose"
herein is meant a dose that produces the effects for which it is administered. The exact dose
will depend on the purpose of the treatment, and will be ascertainable by one skilled in the art
using known techniques. As is known in the art, adjustments for colorectal cancer
degradation, systemic versus localized delivery, and rate of new protease synthesis, as well as
the age, body weight, general health, sex, diet, time of administration, drug interaction and
the severity of the condition may be necessary, and will be ascertainable with routine
experimentation by those skilled in the art.

[233] A "patient" for the purposes of the present invention includes both 
humans and other animals, particularly mammals, and organisms. Thus the methods are 
applicable to both human therapy and veterinary applications. In the preferred embodiment
the patient is a mammal, and in the most preferred embodiment the patient is human. 

[234] The administration of the colorectal cancer proteins and modulators
of the present invention can be done in a variety of ways as discussed above, including, but
not limited to, orally, subcutaneously, intravenously, intranasally, transdermally,
intraperitoneally, intramuscularly, intrapulmonary, vaginally, rectally, or intraocularly. In
some instances, for example, in the treatment of wounds and inflammation, the colorectal
cancer proteins and modulators may be directly applied as a solution or spray.

[235] The pharmaceutical compositions of the present invention comprise a
colorectal cancer protein in a form suitable for administration to a patient. In the preferred
embodiment, the pharmaceutical compositions are in a water soluble form, such as being
present as pharmaceutically acceptable salts, which is meant to include both acid and base
addition salts. "Pharmaceutically acceptable acid addition salt" refers to those salts that retain
the biological effectiveness of the free bases and that are not biologically or otherwise
undesirable, formed with inorganic acids such as hydrochloric acid, hydrobromic acid,
sulfuric acid, nitric acid, phosphoric acid and the like, and organic acids such as acetic acid,
propionic acid, glycolic acid, pyruvic acid, oxalic acid, maleic acid, malonic acid, succinic
acid, fumaric acid, tartaric acid, citric acid, benzoic acid, cinnamic acid, mandelic acid,
methanesulfonic acid, ethanesulfonic acid, p-toluenesulfonic acid, salicylic acid and the like.
"Pharmaceutically acceptable base addition salts" include those derived from inorganic bases
such as sodium, potassium, lithium, ammonium, calcium, magnesium, iron, zinc, copper,
manganese, aluminum salts and the like. Particularly preferred are the ammonium,
potassium, sodium, calcium, and magnesium salts. Salts derived from pharmaceutically
acceptable organic non-toxic bases include salts of primary, secondary, and tertiary amines,
substituted amines including naturally occurring substituted amines, cyclic amines and basic
ion exchange resins, such as isopropylamine, trimethylamine, diethylamine, triethylamine,
tripropylamine, and ethanolamine.

[236] The pharmaceutical compositions may also include one or more of the 
following: carrier proteins such as serum albumin; buffers; fillers such as microcrystalline
cellulose, lactose, corn and other starches; binding agents; sweeteners and other flavoring
agents; coloring agents; and polyethylene glycol. Additives are well known in the art, and 
are used in a variety of formulations. 

[237] In a preferred embodiment, colorectal cancer proteins and modulators 
are administered as therapeutic agents, and can be formulated as outlined above. Similarly, 
colorectal cancer genes (including both the full-length sequence, partial sequences, or
regulatory sequences of the colorectal cancer coding regions) can be administered in gene
therapy applications, as is known in the art. These colorectal cancer genes can include
antisense applications, either as gene therapy (i.e. for incorporation into the genome) or as
antisense compositions, as will be appreciated by those in the art.

[238] In a preferred embodiment, colorectal cancer genes are administered 
as DNA vaccines, either single genes or combinations of colorectal cancer genes. Naked 
DNA vaccines are generally known in the art. Brower, Nature Biotechnology, 16:1304-1305 
(1998). 

[239] In one embodiment, colorectal cancer genes of the present invention
are used as DNA vaccines. Methods for the use of genes as DNA vaccines are well known to
one of ordinary skill in the art, and include placing a colorectal cancer gene or portion of a
colorectal cancer gene under the control of a promoter for expression in a colorectal cancer
patient. The colorectal cancer gene used for DNA vaccines can encode full-length colorectal
cancer proteins, but more preferably encodes portions of the colorectal cancer proteins
including peptides derived from the colorectal cancer protein. In a preferred embodiment a
patient is immunized with a DNA vaccine comprising a plurality of nucleotide sequences
derived from a colorectal cancer gene. Similarly, it is possible to immunize a patient with a
plurality of colorectal cancer genes or portions thereof as defined herein. Without being
bound by theory, following expression of the polypeptide encoded by the DNA vaccine,
cytotoxic T-cells, helper T-cells and antibodies are induced which recognize and destroy or
eliminate cells expressing colorectal cancer proteins.

[240] In a preferred embodiment, the DNA vaccines include a gene encoding 
an adjuvant molecule with the DNA vaccine. Such adjuvant molecules include cytokines that
increase the immunogenic response to the colorectal cancer polypeptide encoded by the
DNA vaccine. Additional or alternative adjuvants are known to those of ordinary skill in the
art and find use in the invention. 

[241] In another preferred embodiment colorectal cancer genes find use in
generating animal models of colorectal cancer. As is appreciated by one of ordinary skill in
the art, when the colorectal cancer gene identified is repressed or diminished in colorectal
cancer tissue, gene therapy technology wherein antisense RNA is directed to the colorectal
cancer gene will also diminish or repress expression of the gene. An animal generated as
such serves as an animal model of colorectal cancer that finds use in screening bioactive
drug candidates. Similarly, gene knockout technology, for example as a result of
homologous recombination with an appropriate gene targeting vector, will result in the
absence of the colorectal cancer protein. When desired, tissue-specific expression or
knockout of the colorectal cancer protein may be necessary.

[242] It is also possible that the colorectal cancer protein is overexpressed in
colorectal cancer. As such, transgenic animals can be generated that overexpress the
colorectal cancer protein. Depending on the desired expression level, promoters of various
strengths can be employed to express the transgene. Also, the number of copies of the
integrated transgene can be determined and compared for a determination of the expression
level of the transgene. Animals generated by such methods find use as animal models of
colorectal cancer and are additionally useful in screening for bioactive molecules to treat
colorectal cancer.

EXAMPLES 

[243] It is understood that the examples described herein in no way serve to 
limit the true scope of this invention, but rather are presented for illustrative purposes. All 
references and sequences of accession numbers cited herein are incorporated by reference in 
their entirety. 

[244] Example 1 

Tissue Preparation, Labeling Chips, and Fingerprints 

[245] Purify total RNA from tissue using TRIzol Reagent
[246] Estimate tissue weight. Homogenize tissue samples in 1ml of TRIzol
per 50mg of tissue using a Polytron 3100 homogenizer. The generator/probe used depends
upon the tissue size. A generator that is too large for the amount of tissue to be homogenized
will cause a loss of sample and lower RNA yield. Use the 20mm generator for tissue
weighing more than 0.6g. If the working volume is greater than 2ml, then homogenize tissue
in a 15ml polypropylene tube (Falcon 2059). Fill tube with no more than 10ml.

HOMOGENIZATION 
[247] Before use, the generator should have been cleaned after its last use by
running it through soapy H2O and rinsing thoroughly. Run through with EtOH to sterilize.
Keep tissue frozen until ready. Add TRIzol directly to frozen tissue then homogenize.
[248] Following homogenization, remove insoluble material from the
homogenate by centrifugation at 7500 x g for 15 min. in a Sorvall superspeed or 12,000 x g
for 10 min. in an Eppendorf centrifuge at 4°C. Transfer the cleared homogenate to a new
tube(s). The samples may be frozen now at -60 to -70°C (and kept for at least one month) or
you may continue with the purification.

PHASE SEPARATION 
[249] Incubate the homogenized samples for 5 minutes at room temperature.
[250] Add 0.2ml of chloroform per 1ml of TRIzol reagent used in the
original homogenization.

[251] Cap tubes securely and shake tubes vigorously by hand (do not vortex)
for 15 seconds.

[252] Incubate samples at room temp. for 2-3 minutes. Centrifuge samples
at 6500rpm in a Sorvall superspeed for 30 min. at 4°C. (You may spin at up to 12,000 x g
for 10 min. but you risk breaking your tubes in the centrifuge.)
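
Because the protocol mixes rpm and x g settings, a conversion between the two can be handy. The sketch below uses the standard relation RCF = 1.118e-5 x r(cm) x rpm^2. The rotor radius is an assumption made for the example, since the text does not name the rotor; check the radius of the rotor actually used.

```python
import math

RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm

def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2

def rpm_from_rcf(rcf, radius_cm):
    """Speed in rpm needed to reach a given relative centrifugal force."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))

# Hypothetical 10 cm rotor radius (an assumption, not stated in the protocol).
assumed_radius_cm = 10.0
print(round(rcf_from_rpm(6500, assumed_radius_cm)))    # force at 6500 rpm
print(round(rpm_from_rcf(12000, assumed_radius_cm)))   # rpm needed for 12,000 x g
```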

RNA PRECIPITATION 
[253] Transfer the aqueous phase to a fresh tube. Save the organic phase if
isolation of DNA or protein is desired. Add 0.5ml of isopropyl alcohol per 1ml of TRIzol
reagent used in the original homogenization. Cap tubes securely and invert to mix. Incubate
samples at room temp. for 10 minutes. Centrifuge samples at 6500rpm in Sorvall for 20 min.
at 4°C.



RNA WASH 

[254] Pour off the supernate. Wash pellet with cold 75% ethanol. Use 1ml
of 75% ethanol per 1ml of TRIzol reagent used in the initial homogenization. Cap tubes
securely and invert several times to loosen pellet. (Do not vortex.) Centrifuge at <8000rpm
(<7500 x g) for 5 minutes at 4°C.
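
All of the reagent volumes so far scale with the TRIzol volume, which in turn scales with tissue weight. A small helper, given here only as a convenience sketch with hypothetical names, collects the ratios stated in the steps above (1ml TRIzol per 50mg tissue; 0.2ml chloroform, 0.5ml isopropyl alcohol and 1ml of 75% ethanol per 1ml TRIzol).

```python
def trizol_reagent_volumes_ml(tissue_mg):
    """Reagent volumes in ml for a TRIzol extraction of the given tissue weight."""
    trizol_ml = tissue_mg / 50.0             # 1 ml TRIzol per 50 mg tissue
    return {
        "trizol": trizol_ml,
        "chloroform": 0.2 * trizol_ml,       # phase separation
        "isopropyl_alcohol": 0.5 * trizol_ml,  # RNA precipitation
        "ethanol_75pct": 1.0 * trizol_ml,    # RNA wash
    }

volumes = trizol_reagent_volumes_ml(100.0)
for reagent, ml in volumes.items():
    print(f"{reagent}: {ml:.2f} ml")
```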

[255] Pour off the wash. Carefully transfer pellet to an Eppendorf tube (let it
slide down the tube into the new tube and use a pipet tip to help guide it in if necessary).
Depending on the volumes you are working with, you can decide what size tube(s) you want
to precipitate the RNA in. When I tried leaving the RNA in the large 15ml tube, it took so
long to dry (i.e. it did not dry) that I eventually had to transfer it to a smaller tube. Let pellet
dry in hood. Resuspend RNA in an appropriate volume of DEPC H2O. Try for 2-5ug/ul.
Take absorbance readings.
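
For the absorbance reading, a common rule of thumb (not stated in this protocol, so treated here as an assumption) is that an A260 of 1.0 in a 1 cm path corresponds to roughly 40 ug/ml of RNA. The sketch below applies that factor and checks the result against the 2-5ug/ul target; the function names and readings are hypothetical.

```python
def rna_concentration_ug_per_ul(a260, dilution_factor, ug_per_ml_per_a260=40.0):
    """Estimate RNA concentration of the undiluted stock from a diluted A260 reading."""
    return a260 * dilution_factor * ug_per_ml_per_a260 / 1000.0

def within_target(concentration, low=2.0, high=5.0):
    """True if the concentration falls in the 2-5 ug/ul range suggested above."""
    return low <= concentration <= high

# Hypothetical reading: A260 of 0.5 measured on a 1:100 dilution.
stock_concentration = rna_concentration_ug_per_ul(0.5, 100)
print(stock_concentration, within_target(stock_concentration))
```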

[256] Purify poly A+ mRNA from total RNA or clean up total RNA with
Qiagen's RNeasy kit

[257] Purification of poly A+ mRNA from total RNA. Heat Oligotex
suspension to 37°C and mix immediately before adding to RNA. Incubate Elution Buffer at
70°C. Warm up 2 x Binding Buffer at 65°C if there is precipitate in the buffer. Mix total
RNA with DEPC-treated water, 2 x Binding Buffer, and Oligotex according to Table 2 on
page 16 of the Oligotex Handbook. Incubate for 3 minutes at 65°C. Incubate for 10 minutes
at room temperature.

[258] Centrifuge for 2 minutes at 14,000 to 18,000 g. If centrifuge has a
"soft setting," then use it. Remove supernatant without disturbing Oligotex pellet. A little bit
of solution can be left behind to reduce the loss of Oligotex. Save sup until certain that
satisfactory binding and elution of poly A+ mRNA has occurred.

[259] Gently resuspend in Wash Buffer OW2 and pipet onto spin column. 
Centrifuge the spin column at full speed (soft setting if possible) for 1 minute.

[260] Transfer spin column to a new collection tube and gently resuspend in
Wash Buffer OW2 and centrifuge as described herein.

[261] Transfer spin column to a new tube and elute with 20 to 100 ul of
preheated (70°C) Elution Buffer. Gently resuspend Oligotex resin by pipetting up and down.
Centrifuge as above. Repeat elution with fresh elution buffer or use first eluate to keep the
elution volume low.

[262] Read absorbance, using diluted Elution Buffer as the blank.

[263] Before proceeding with cDNA synthesis, the mRNA must be 
precipitated. Some component leftover or in the Elution Buffer from the Oligotex 
purification procedure will inhibit downstream enzymatic reactions of the mRNA. 

Ethanol Precipitation
[264] Add 0.4 vol. of 7.5 M NH4OAc + 2.5 vol. of cold 100% ethanol.
Precipitate at -20°C 1 hour to overnight (or 20-30 min. at -70°C). Centrifuge at 14,000-
16,000 x g for 30 minutes at 4°C. Wash pellet with 0.5ml of 80% ethanol (-20°C) then
centrifuge at 14,000-16,000 x g for 5 minutes at room temperature. Repeat 80% ethanol
wash. Dry the last bit of ethanol from the pellet in the hood. (Do not speed vacuum.)
Suspend pellet in DEPC H2O at 1ug/ul concentration.
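
The precipitation volumes and the final resuspension volume follow directly from the sample volume and RNA mass. The sketch below assumes that both "vol." figures refer to the original sample volume, which the text does not state explicitly; the function names are hypothetical.

```python
def ethanol_precipitation_ul(sample_ul):
    """Volumes to add, assuming both ratios are relative to the original sample volume."""
    return {
        "NH4OAc_7.5M": 0.4 * sample_ul,
        "ethanol_100pct_cold": 2.5 * sample_ul,
    }

def resuspension_volume_ul(rna_ug, target_ug_per_ul=1.0):
    """Volume of DEPC H2O needed to reach the target concentration."""
    return rna_ug / target_ug_per_ul

additions = ethanol_precipitation_ul(100.0)
water_ul = resuspension_volume_ul(8.0)
print(additions, water_ul)
```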

Clean up total RNA using Qiagen's RNeasy kit
[265] Add no more than 100ug to an RNeasy column. Adjust sample to a
volume of 100ul with RNase-free water. Add 350ul Buffer RLT then 250ul ethanol (100%)
to the sample. Mix by pipetting (do not centrifuge) then apply sample to an RNeasy mini
spin column. Centrifuge for 15 sec at >10,000rpm. If concerned about yield, re-apply
flowthrough to column and centrifuge again.
[266] Transfer column to a new 2-ml collection tube. Add 500ul Buffer RPE
and centrifuge for 15 sec at >10,000rpm. Discard flowthrough. Add 500ul Buffer RPE and
centrifuge for 15 sec at >10,000rpm. Discard flowthrough then centrifuge for 2 min at
maximum speed to dry column membrane. Transfer column to a new 1.5-ml collection tube
and apply 30-50ul of RNase-free water directly onto column membrane. Centrifuge 1 min at
>10,000rpm. Repeat elution.

[267] Take absorbance reading. If necessary, ethanol precipitate with 
ammonium acetate and 2.5X volume 100% ethanol. 

[268] Make cDNA using Gibco's "Superscript Choice System for cDNA
Synthesis" kit

First Strand cDNA Synthesis
[269] Use 5ug of total RNA or 1ug of polyA+ mRNA as starting material.
For total RNA, use 2ul of Superscript RT. For polyA+ mRNA, use 1ul of Superscript RT.
Final volume of first strand synthesis mix is 20ul. RNA must be in a volume no greater than
10ul. Incubate RNA with 1ul of 100pmol T7-T24 oligo for 10 min at 70°C. On ice, add 7ul
of: 4ul 5X 1st Strand Buffer, 2ul of 0.1M DTT, and 1ul of 10mM dNTP mix. Incubate at
37°C for 2 min then add Superscript RT.
Incubate at 37°C for 1 hour.
Second Strand Synthesis 
Place 1st strand reactions on ice.

Add: 91ul DEPC H2O
30ul 5X 2nd Strand Buffer
3ul 10mM dNTP mix
1ul 10U/ul E. coli DNA Ligase
4ul 10U/ul E. coli DNA Polymerase
1ul 2U/ul RNase H

[270] Make the above into a mix if there are more than 2 samples. Mix and
incubate 2 hours at 16°C.
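
When a master mix is made for several samples, the per-reaction volumes listed above are simply scaled. The sketch below does that scaling; the 10% overage is an assumption added to allow for pipetting loss and is not part of the protocol, and the function name is hypothetical.

```python
# Per-reaction second strand additions in ul, as listed in the protocol.
SECOND_STRAND_UL = {
    "DEPC H2O": 91.0,
    "5X 2nd Strand Buffer": 30.0,
    "10mM dNTP mix": 3.0,
    "E. coli DNA Ligase": 1.0,
    "E. coli DNA Polymerase": 4.0,
    "RNase H": 1.0,
}

def master_mix(n_samples, overage=0.1):
    """Scale each component for n samples plus a fractional overage (assumed 10%)."""
    factor = n_samples * (1.0 + overage)
    return {name: round(ul * factor, 2) for name, ul in SECOND_STRAND_UL.items()}

# 130 ul of mix is added to each 20 ul first strand reaction (150 ul total).
per_reaction_ul = sum(SECOND_STRAND_UL.values())
mix_for_four = master_mix(4)
print(per_reaction_ul, mix_for_four)
```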

[271] Add 2ul T4 DNA Polymerase. Incubate 5 min at 16°C. Add 10ul of
0.5M EDTA.

[272] Clean up cDNA
[273] Phenol:Chloroform:Isoamyl Alcohol (25:24:1) purification using
Phase-Lock gel tubes:

[274] Centrifuge PLG tubes for 30 sec at maximum speed. Transfer cDNA
mix to PLG tube. Add equal volume of phenol:chloroform:isoamyl alcohol and shake
vigorously (do not vortex). Centrifuge 5 minutes at maximum speed. Transfer top aqueous
solution to a new tube. Ethanol precipitate: add 7.5X 5M NH4OAc and 2.5X volume of
100% ethanol. Centrifuge immediately at room temp. for 20 min, maximum speed. Remove
sup then wash pellet 2X with cold 80% ethanol. Remove as much ethanol wash as possible
then let pellet air dry. Resuspend pellet in 3ul RNase-free water.

In vitro Transcription (IVT) and labeling with biotin

Pipet 1.5ul of cDNA into a thin-wall PCR tube.



30 



Make NTP labeling mix:

Combine at room temperature:
2 ul T7 10x ATP (75 mM) (Ambion)
2 ul T7 10x GTP (75 mM) (Ambion)
1.5 ul T7 10x CTP (75 mM) (Ambion)
1.5 ul T7 10x UTP (75 mM) (Ambion)
3.75 ul 10 mM Bio-11-UTP (Boehringer-Mannheim/Roche or Enzo)
3.75 ul 10 mM Bio-16-CTP (Enzo)
2 ul 10x T7 transcription buffer (Ambion)
2 ul 10x T7 enzyme mix (Ambion)

[275] Final volume of total reaction is 20 ul. Incubate 6 hours at 37C in a
PCR machine.
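
The stated 20 ul final volume can be checked against the listed components plus the 1.5 ul of cDNA. The sketch below also derives the fraction of UTP and CTP that is biotinylated; that fraction is simple arithmetic on the listed volumes and concentrations and is not a figure stated in the protocol. Function names are illustrative.

```python
# NTP labeling mix components (ul), as listed above.
IVT_MIX_UL = {
    "T7 10x ATP (75 mM)": 2.0,
    "T7 10x GTP (75 mM)": 2.0,
    "T7 10x CTP (75 mM)": 1.5,
    "T7 10x UTP (75 mM)": 1.5,
    "10 mM Bio-11-UTP": 3.75,
    "10 mM Bio-16-CTP": 3.75,
    "10x T7 transcription buffer": 2.0,
    "10x T7 enzyme mix": 2.0,
}
CDNA_UL = 1.5  # cDNA template pipetted into the PCR tube


def total_volume(components_ul, template_ul):
    """Sum of all component volumes plus the template volume (ul)."""
    return sum(components_ul.values()) + template_ul


def labeled_fraction(bio_ul, bio_mM, plain_ul, plain_mM):
    """Molar fraction of a nucleotide that carries biotin."""
    bio = bio_ul * bio_mM        # nmol of biotinylated nucleotide
    plain = plain_ul * plain_mM  # nmol of unlabeled nucleotide
    return bio / (bio + plain)


ivt_total = total_volume(IVT_MIX_UL, CDNA_UL)
bio_utp_fraction = labeled_fraction(3.75, 10.0, 1.5, 75.0)
print(ivt_total, bio_utp_fraction)
```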


RNeasy clean-up of IVT product
[276] Follow previous instructions for RNeasy columns or refer to Qiagen's
RNeasy protocol handbook.

[277] cRNA will most likely need to be ethanol precipitated. Resuspend in
a volume compatible with the fragmentation step.

Fragmentation 

[278] 15 ug of labeled RNA is usually fragmented. Try to minimize the
fragmentation reaction volume; a 10 ul volume is recommended but 20 ul is all right. Do not
go higher than 20 ul because the magnesium in the fragmentation buffer contributes to
precipitation in the hybridization buffer.

[279] Fragment RNA by incubation at 94 C for 35 minutes in 1 x
Fragmentation buffer.

5 x Fragmentation buffer:
200 mM Tris-acetate, pH 8.1
500 mM KOAc
150 mM MgOAc
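
Because the fragmentation is run in 1 x buffer made from the 5 x stock, the buffer volume is one fifth of the reaction volume. The following sketch, with an illustrative function name and an assumed RNA volume in the example, computes the buffer and water volumes and the resulting 1 x salt concentrations.

```python
# 5 x Fragmentation buffer composition (mM), as listed above.
FIVE_X_MM = {"Tris-acetate": 200.0, "KOAc": 500.0, "MgOAc": 150.0}


def fragmentation_setup(reaction_ul, rna_ul, stock_x=5.0, final_x=1.0):
    """Return (buffer_ul, water_ul) for a fragmentation reaction."""
    buffer_ul = reaction_ul * final_x / stock_x
    water_ul = reaction_ul - buffer_ul - rna_ul
    if water_ul < 0:
        raise ValueError("RNA volume too large for this reaction volume")
    return buffer_ul, water_ul


# Concentrations in the 1 x reaction.
one_x_mm = {name: conc / 5.0 for name, conc in FIVE_X_MM.items()}

# Example: recommended 10 ul reaction with RNA in 6 ul (assumed volume).
buffer_ul, water_ul = fragmentation_setup(10.0, 6.0)
print(buffer_ul, water_ul, one_x_mm)
```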

[280] The labeled RNA transcript can be analyzed before and after
fragmentation. Samples can be heated to 65C for 15 minutes and electrophoresed on 1%
agarose/TBE gels to get an approximate idea of the transcript size range.

Hybridization 

[281] 200 ul (10 ug cRNA) of a hybridization mix is put on the chip. If
multiple hybridizations are to be done (such as cycling through a 5 chip set), then it is
recommended that an initial hybridization mix of 300 ul or more be made.
Hybridization Mix:
fragmented labeled RNA (50 ng/ul final conc.)
50 pM 948-b control oligo
1.5 pM BioB
5 pM BioC
25 pM BioD
100 pM CRE
0.1 mg/ml herring sperm DNA
0.5 mg/ml acetylated BSA
to 300 ul with 1x MES hyb. buffer
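
The amounts in the hybridization mix follow from the final concentrations and the mix volume. As a hedged illustration (function names are assumptions), the sketch below shows that 50 ng/ul in 200 ul corresponds to the 10 ug of cRNA put on the chip, and that a 300 ul mix requires 15 ug of fragmented RNA.

```python
def rna_mass_ug(mix_volume_ul, final_ng_per_ul=50.0):
    """Mass of fragmented RNA (ug) needed for a given mix volume."""
    return mix_volume_ul * final_ng_per_ul / 1000.0


def spike_fmol(final_pM, mix_volume_ul):
    """Amount of a control (fmol) at a given pM concentration and volume.

    pM * ul = 1e-12 mol/L * 1e-6 L = 1e-18 mol, so divide by 1000 for fmol.
    """
    return final_pM * mix_volume_ul / 1000.0


on_chip_ug = rna_mass_ug(200.0)    # volume put on the chip
full_mix_ug = rna_mass_ug(300.0)   # recommended initial mix volume
oligo_fmol = spike_fmol(50.0, 300.0)  # 948-b control oligo in 300 ul
print(on_chip_ug, full_mix_ug, oligo_fmol)
```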

[282] The instruction manuals for the products used herein are incorporated 
herein in their entirety. 

Labeling Protocol Provided Herein

Hybridization reaction:

Start with non-biotinylated IVT (purified by RNeasy columns)
(see example 1 for steps from tissue to IVT)
IVT antisense RNA; 4 ug: __ ul
Random Hexamers (1 ug/ul): 4 ul
H2O: __ ul
14 ul

- Incubate 70°C, 10 min. Put on ice.
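
The RNA and water volumes in the priming reaction above are left open because they depend on the concentration of the IVT product. A minimal sketch, assuming the RNA concentration is known in ug/ul (the function name and example concentrations are illustrative), is:

```python
def priming_mix(rna_ug_per_ul, rna_ug=4.0, hexamer_ul=4.0, total_ul=14.0):
    """Return (rna_ul, water_ul) for the random hexamer priming reaction."""
    rna_ul = rna_ug / rna_ug_per_ul
    water_ul = total_ul - hexamer_ul - rna_ul
    if water_ul < 0:
        # RNA too dilute: 4 ug does not fit in the 14 ul reaction.
        raise ValueError("RNA too dilute; concentrate before use")
    return rna_ul, water_ul


# Example concentrations (assumed values, not from the protocol).
print(priming_mix(1.0))   # RNA at 1 ug/ul
print(priming_mix(0.5))   # RNA at 0.5 ug/ul
```

RNA more dilute than 0.4 ug/ul cannot supply 4 ug within the 10 ul left after the hexamers, which the function reports as an error.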

Reverse transcription:

5X First Strand (BRL) buffer: 6 ul
0.1 M DTT: 3 ul
50X dNTP mix: 0.6 ul
H2O: 2.4 ul
Cy3 or Cy5 dUTP (1 mM): 3 ul
SS RT II (BRL): 1 ul
16 ul
- Add to hybridization reaction.

- Incubate 30 min., 42°C.

- Add 1 ul SSII and let go for another hour.
Put on ice.

- 50X dNTP mix (25 mM of cold dATP, dCTP, and dGTP, 10 mM of dTTP: 25
ul each of 100 mM dATP, dCTP, and dGTP; 10 ul of 100 mM dTTP to 15 ul H2O. dNTPs
from Pharmacia)
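
The concentrations quoted for the 50X dNTP mix can be verified from the listed stock volumes. The sketch below is illustrative only; it also computes the working concentrations when 0.6 ul of the mix is used in the combined 30 ul reaction (14 ul priming reaction plus 16 ul reverse transcription mix), a total that is inferred here from the two volumes given in the protocol.

```python
# Volumes (ul) combined to make the 50X dNTP mix; all stocks are 100 mM.
DNTP_STOCK_MM = 100.0
DNTP_VOLUMES_UL = {"dATP": 25.0, "dCTP": 25.0, "dGTP": 25.0, "dTTP": 10.0}
WATER_UL = 15.0

total_ul = sum(DNTP_VOLUMES_UL.values()) + WATER_UL

# Concentration of each nucleotide in the 50X mix (mM).
mix_mM = {n: DNTP_STOCK_MM * v / total_ul for n, v in DNTP_VOLUMES_UL.items()}

# Working concentration in the labeling reaction (mM).
MIX_USED_UL = 0.6
REACTION_UL = 14.0 + 16.0  # inferred total of the two reaction volumes
working_mM = {n: c * MIX_USED_UL / REACTION_UL for n, c in mix_mM.items()}

print(total_ul, mix_mM, working_mM)
```

The lower dTTP concentration leaves room for incorporation of the Cy3 or Cy5 dUTP.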

RNA degradation:

- Add 1.5 ul 1M NaOH/ 2mM EDTA, incubate at 65°C, 10 min.

1M NaOH/ 2mM EDTA:
86 ul H2O
10 ul 10N NaOH
4 ul 50mM EDTA

U-Con 30

500 ul TE/sample spin at 7000g for 10 min, save flow through for purification

Qiagen purification:

- suspend u-con recovered material in 500 ul buffer PB
- proceed w/ normal Qiagen protocol

DNAse digest:

- Add 1 ul of 1/100 dil of DNAse/30 ul Rx and incubate at 37°C for 15 min.
- 5 min 95°C to denature enzyme

Sample preparation:

- Add:
Cot-1 DNA: 10 ul
50X dNTPs: 1 ul
Na pyrophosphate: 7.5 ul
10 mg/ml Herring sperm DNA: 1 ul of 1/10 dilution
21.8 final vol.

- Dry down in speed vac.
- Resuspend in 15 ul H2O.
- Add 0.38 ul 10% SDS.
- Heat 95°C, 2 min.
- Slow cool at room temp. for 20 min.

Put on slide and hybridize overnight at 64°C.

Washing after the hybridization:

3X SSC/0.03% SDS: 2 min. 37.5 ml 20X SSC + 0.75 ml 10% SDS in 250 ml H2O
1X SSC: 5 min. 12.5 ml 20X SSC in 250 ml H2O
0.2X SSC: 5 min. 2.5 ml 20X SSC in 250 ml H2O

Dry slides in centrifuge, 1000 RPM, 1 min.
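
The wash recipes above are plain dilutions of 20X SSC and 10% SDS. The sketch below checks them with C1 x V1 = C2 x V2, under the assumption that "in 250 ml H2O" denotes a final volume of 250 ml; the function name is illustrative.

```python
def final_conc(stock_conc, stock_ml, final_ml=250.0):
    """Concentration after diluting stock_ml of stock to final_ml."""
    return stock_conc * stock_ml / final_ml


# (wash name, ml of 20X SSC, ml of 10% SDS), as listed above.
WASHES = [
    ("3X SSC/0.03% SDS", 37.5, 0.75),
    ("1X SSC", 12.5, 0.0),
    ("0.2X SSC", 2.5, 0.0),
]

results = {
    name: (final_conc(20.0, ssc_ml), final_conc(10.0, sds_ml))
    for name, ssc_ml, sds_ml in WASHES
}
for name, (ssc_x, sds_pct) in results.items():
    print(f"{name}: {ssc_x:g}X SSC, {sds_pct:g}% SDS")
```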

[283] Scan using appropriate Photomultiplier tube (PMT) and fluorescent 
excitation and emission channels. 

[284] The results are shown in Table 1 and Table 2. The lists of genes come
from colorectal tumors from a variety of stages of the disease. The genes that are up
regulated in the tumors (overall) were also found to be expressed at a limited amount or not at
all in the body map. The body map consists of at least 28 tissue types, including Adrenal
Gland, Bladder, Bone Marrow, Brain, Breast, Cervix, Colon, Diaphragm, Heart, Kidney,
Liver, Lung, Lymph Node, Muscle, Pancreas, Prostate, Rectum, Salivary Gland, Skin, Small
Intestine, Spinal Cord, Spleen, Stomach, Testis, Thymus, Thyroid, Trachea and Uterus. As
indicated, some of the Accession numbers include expression sequence tags (ESTs). Thus, in
one embodiment herein, genes within an expression profile, also termed expression profile
genes, include ESTs and are not necessarily full length.

[285] Table 1 shows Accession numbers for 1747 genes upregulated in colon
tumor tissue. The table provides the exemplar accession numbers, Unigene ID numbers,
unique Eos codes, descriptions of the genes encoded, and relative amount of expression as
compared with expression in other normal body tissue.
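
For readers who wish to process the table electronically, the row layout described above (primekey, Eos code, exemplar accession, optional Unigene ID, title, ratio) can be parsed as sketched below. This is an illustrative sketch, not part of the disclosed method; the regular expression and function name are assumptions, and rows that do not fit the layout are returned as None so they can be reviewed by hand.

```python
import re

# Pkey, Eos probeset code, accession, optional UniGene ID, title, ratio.
ROW = re.compile(
    r"^(?P<pkey>\d{6})\s+(?P<probeset>EOS\d{5})\s+(?P<accn>\S+)\s+"
    r"(?:(?P<unigene>Hs\.\d+)\s+)?(?P<title>.+?)\s+(?P<ratio>\d+(?:\.\d+)?)$"
)


def parse_row(line):
    """Parse one table row into a dict, or return None if it does not fit."""
    match = ROW.match(line.strip())
    if match is None:
        return None
    row = match.groupdict()
    row["ratio"] = float(row["ratio"])
    return row


with_unigene = parse_row(
    "332640 EOS32571 AA417152 Hs.5101 protein regulator of cytokinesis 1 4.6"
)
without_unigene = parse_row(
    "312339 EOS12270 AA524394 EST cluster (not in UniGene) 4.4"
)
print(with_unigene)
print(without_unigene)
```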
TABLE 1: GENES INVOLVED IN COLORECTAL CANCER

Pkey: Primekey (unique probeset identifier)
Ex. Accn.: Exemplar accession number
Probeset: Eos Code number
Unigene#: Unigene number

Pkey Probeset Ex. Accn. Unigene ID UniGene Title Ratio TumMet/Body

332264 EOS32195 M72849 Hs.115263 epiregulin 17.6

332716 EOS32647 L00058 Hs.79070 v-myc avian myelocytomatosis viral oncogene homolog 15.0

312845 EOS12776 AI911215 Hs.186555 ESTs 14.3

310257 EOS10188 AW389247 Hs.148826 ESTs 11.6

322567 EOS22498 AF155108 EST cluster (not in UniGene) 11.5

331060 EOS30991 N75081 Hs.21648 ESTs 10.3

322303 EOS22234 W07459 EST cluster (not in UniGene) 9.6

301891 EOS01822 AF131855 Hs.106127 Homo sapiens clone 25056 mRNA sequence 9.5

318524 EOS18455 AW291511 Hs.253687 ESTs 8.9

314001 EOS13932 AW168495 Hs.8750 ESTs 7.8

331183 EOS31114 T40769 Hs.8469 EST 7.3

315429 EOS15360 AW009951 HsJ06892 ESTs 7.3

303344 EOS03275 AA255977 Hs.250646 ESTs; Highly similar to ubiquitin-conjugating enzyme [M.musculus] 6.7

313625 EOS13556 AW468402 Hs.254020 ESTs 6.7

307084 EOS07015 AI160527 EST singleton (not in UniGene) with exon hit 6.1

314943 EOS14874 AI476797 Hs.184572 cell division cycle 2; G1 to S and G2 to M 6.1

303753 EOS03684 AW503733 Hs.170315 ESTs 5.7

315593 EOS15524 AW198103 Hs.158154 ESTs 5.3

313604 EOS13535 AI745325 Hs.182286 ESTs; Moderately similar to !!!! ALU SUBFAMILY SB2 WARNING ENTRY !!!! [H.sapiens] 5.1

312319 EOS12250 AA216698 Hs.180780 Homo sapiens agrin precursor mRNA; partial cds 5.1

312614 EOS12545 AI766732 Hs.201194 ESTs 4.8

323176 EOS23107 AW071648 Hs.123199 ESTs 4.8

317916 EOS17847 AI565071 Hs.159983 ESTs 4.7

301846 EOS01777 R20002 Hs.6823 ESTs; Weakly similar to intrinsic factor-B12 receptor precursor [H.sapiens] 4.6

311157 EOS11088 AI990122 Hs.196988 ESTs 4.6

332640 EOS32571 AA417152 Hs.5101 protein regulator of cytokinesis 1 4.6

311728 EOS11659 AW083000 Hs.184776 ribosomal protein L23a 4.5

313774 EOS13705 AW136836 Hs.144583 ESTs 4.5

312339 EOS12270 AA524394 EST cluster (not in UniGene) 4.4

315369 EOS15300 AA764918 Hs.256531 ESTs 4.3

303756 EOS03687 AI738468 Hs.115838 ESTs 4.3

301050 EOS00981 AW136973 Hs.144475 ESTs; Weakly similar to mitogen inducible gene mig-2 [H.sapiens] 4.3

300319 EOS00250 AW157646 Hs.153506 ESTs; Weakly similar to microtubule-actin crosslinking factor [M.musculus] 4.3

300664 EOS00595 AI444628 Hs.256809 ESTs 4.3

302655 EOS02586 AJ227892 EST cluster (not in UniGene) with exon hit 4.1

315175 EOS15106 AI025842 Hs.152530 ESTs 4.1

330786 EOS30717 D60374 Hs.258712 EST 4.1

310875 EOS10806 T47764 Hs.132917 ESTs 4.1

313425 EOS13356 AA745689 Hs.186838 ESTs; Weakly similar to similar to zinc finger 5 protein from Gallus gallus; U51640 [H.sapiens] 4.0

301804 EOS01735 AA581004 EST cluster (not in UniGene) with exon hit 4.0

332203 EOS32134 H49388 Hs.102082 EST 3.9

322968 EOS22899 AI905228 EST cluster (not in UniGene) 3.8

321524 EOS21455 N79126 EST cluster (not in UniGene) 3.6

302476 EOS02407 AF182294 EST cluster (not in UniGene) with exon hit 3.6

303295 EOS03226 AA205625 Hs.208067 ESTs 3.6

310016 EOS09947 AW449612 Hs.152475 ESTs 3.7

324871 EOS24802 AW297755 Hs.148832 ESTs 3.7

322887 EOS22818 AI986306 Hs.233460 ESTs; Weakly similar to KIAA0369 protein [H.sapiens] 3.7

313171 EOS13102 N67879 Hs.157695 ESTs 3.7

321638 EOS21569 AI356352 Hs.108932 ESTs 3.7

320445 EOS20376 R33916 EST cluster (not in UniGene) 3.6

302149 EOS02080 AI383794 Hs.152337 protein arginine N-methyltransferase 3 (hnRNP methyltransferase S. cerevisiae)-like 3 3.6

316905 EOS16836 AW138241 Hs.210846 ESTs 3.6

313166 EOS13097 AI801098 Hs.151500 ESTs 3.6

323338 EOS23269 R74219 Hs.23348 S-phase kinase-associated protein 2 (p45) 3.5

311434 EOS11365 AW016607 Hs.201582 ESTs 3.5

312742 EOS12673 AK50363 Hs.116462 ESTs 3.4

323587 EOS23518 AI905527 Hs.141901 ESTs; Moderately similar to !!!! ALU SUBFAMILY SP WARNING ENTRY !!!! [H.sapiens] 3.4

317390 EOS17321 AW136551 Hs.181245 ESTs 3.4

315282 EOS15213 AI222165 Hs.144923 ESTs 3.4

318565 EOS18498 AI440137 Hs.164989 ESTs 3.4

307586 EOS07517 AI285499 EST singleton (not in UniGene) with exon hit 3.4

321052 EOS20983 AW372884 Hs.240770 nuclear cap binding protein subunit 2; 20kD 3.3

324338 EOS24269 AL138357 Hs.247514 ESTs 3.3

307517 EOS07448 AI275055 Hs.164989 ESTs 3.3

314852 EOS14783 AI903735 Hs.137527 ESTs; Weakly similar to X-linked retinopathy protein [H.sapiens] 3.3

324657 EOS24588 AW451142 Hs.255628 ESTs 3.2

314912 EOS14843 AI431345 Hs.161784 ESTs 3.2

324790 EOS24721 AI334367 Hs.159337 ESTs 3.2

315498 EOS15429 AA628539 Hs.116252 ESTs; Moderately similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 3.2

312857 EOS12788 AA772279 Hs.128914 ESTs 3.2
300762 EOS00693 AI497778 Hs.168053 ESTs 3.2
325587 EOS25518 c12_hs gf|66824S2|reil gn 1 +126724 126967 ex 77 CDSJ Z44 244 3099 

CH.12_hs #682462 3.2 

320654 EOS20585 AW263086 Hs.118112 ESTs 3.2

316715 EOS16646 AI440266 Hs.170873 ESTs 3.1
333279 EOS33210 CH22_522FG_126_LUNK^atAC005500.GENSCAN.8-1 

CH2*.FGENES.126_1 3,1 

309689 EOS09620 AW236171 Hs.181357 laminin receptor 1 (67kD; ribosomal protein SA) 3.1

323846 EOS23777 AA337621 Hs.137635 ESTs 3.1 

10 324676 EOS24609 AI990739 Hs.236511 ESTs;IModerate!y6lrnBartoRNAsplidngHBJa^ 3.1 

306362 EOS08293 AI613519 EST singleton (not In UniGene) with exon hit 3.1 

308615 EOS08546 AI738593 EST singleton (not in UniGene) with exon hit 3.0 

315397 EOS15328 AA218940 Hs.137516 ESTs 3.0 

302236 EOS02167 AI128606 Hs.167558 zinc finger protein 161 3.0 

321693 EOS21624 AA700017 Hs.173737 ras-related C3 botulinum toxin substrate 1 (rho family; small GTP binding protein Rac1) 3.0

330814 EOS30745 AA015730 Hs.247277 ESTs; Weakly similar to btrosforniafior^^ 3.0 

302977 EOS02908 AW263124 EST cluster (not in UniGene) with exon hit 3.0 

327516 EOS27447 c_2_hs gi|6117815|refign 6 +199078 199216 ex 4 4 COS! 9.15139 1551 

CK02JiSflIl6117815 2.9 

20 333278 EOS33209 C^2^521FG_125JLUr4KJE(ytAC005500.GENSCAN.7-2 

CH22J : GENES.125_2 2-9 

302088 EOS02019 U77629 Hs.135639 achaete-scute complex (Drosophila) homolog-like 2 2.9

322718 EOS22649 AF150270 Hs.233322 ESTs; Weakly similar to cDNA EST EMBL:T01156 comes from this gene [C.elegans] 2.9
329154 EOS29085 c_x_hs gjj58686B6lrefl gn 2- 200851 201356 ex 1 3 CDSI 30.28 506 1812 

25 CHXJ«gi|58686B6 29 

315978 EOS15909 AA830893 Hs.1 19769 ESTs 2.9 

302677 EOS02608 H63227 Hs.132880 ESTs; Highly similar to ubiquitin-conjugating enzyme [M.musculus] 2.9

315007 EOS14938 A1806583 Hs.125291 ESTs - 2.9 

303780 EOS03711 AI424014 Hs.243450 ESTs; Moderately similar to KIAA0456 protein [H.sapiens] 2.9

30 331362 EOS31293 AA417956 Hs.40782 ESTs 2.9 
335815 EOS35746 O^3187FG_618J_UNK^EJ^C005500.GENSCAN.51(W 

CH22_FGENES.618_3 2.8 

332070 EOS32001 AA598545 Hs.228138 EST 2.8 

315720 EOS15651 AW291875 Hs.163900 ESTs 2.8

35 311913 EOS11844 A1358522 Hs.221417 ESTs 2.8 

331014 EOS30945 K98597 Hs.30340 ESTs 2.8 

322035 EOS21966 AL137517 EST duster (not m UniGene) 2.8 

338057 EOS37988 CH22_6558FG_UNK.EKtAC00550aGENSCAN.160-1 

CH22L^M^0550aGENSCAN.160-1 2.8 

40 335B29 EOS35760 CH22^3202FG_620_3_UNK_E\tACC05500.GENSCAN.512-3 

CH22_FGENES.620 3 Z8 

312136 EOS12067 AW451469 Hb.209990 ESTs Z8 

303132 EOS03063 A1929819 Hs.193330 ESTs 2.8 

317548 EOS17479 AI654187 Hs.195704 ESTs 28 

45 325585 EOS25516 d 2> gjJ6682462iretl gn 1 ♦ 73476 73574 ex 5 7 COSi 8.5299309 

7 CH.12_hsgi|6682462 2.7 

334631 EOS34562 CH22_1939FG_416_7_UNK_EMAC005500.GENSCAN.277-7 

CH22>FG£NEa416_7 2.7 
329156 EOS29087 cjLhs gi)5868686|refl gn 2 -202013 202341 ex 33 CDSf 10.23 329 1814 

50 CHJehsgii5868686 2.7 

318615 EOS18546 A1133617 Hs.191088 ESTs 2.7 

300734 EOS00665 AW205197 Hs^40951 ESTs 2.7 

324430 EOS24361 AA464018 EST cluster (not in UniGene) 2.7 

322296 EOS22227 W76326 HsJ51937 ESTs 2.7 

303842 EOS03773 AI337304 Hs.126268 ESTs; Weakly similar to similar to PDZ domain [C.elegans] 2.7

320909 EOS20840 D62269 EST duster (not in UniGene) 27 

325195 EOS25126 T2Q25B Hs.171443 ESTs; Weakly slmflar to acfin binding protein MAYVEN [Rsapiens] 27 

324959 EOS24890 AW367745 Hs.1 431 37 ESTs 27 

309997 EOS09928 A1291621 Hs.145199 ESTs 27 

60 329367 EOS29298 c_x_hs gl|5868842jrei} gn 1 - 87201 87587 ex 1 4 CDSI 8.13 387 3908 

CKXJisgi|5868842 27 

316697 EOS16628 AW293174 Hs.252627 ESTs 27 

313600 EOS13531 AA429564 Hs.185802 ESTs 27 

301471 EOS01402 AA995014 Hs.129544 ESTs; WeaHy similar to ORF YLL027w (S.cerevisiae) 26 

65 300810 EOS00741 A1076890 Hs.1 86949 ESTs 26 

319976 EOS19907 N48809 Hs260824 ESTs 26 

313434 EOS13365 W92070 Hs-231902 ESTs 26 
333849 EOS33780 CH22.1118FG_29OJ - UNrv.ENtAC005500.GENSCAN.146.7 

CH22_FGENES.29D_8 26 

330744 EOS30675 AA406142 Hs.12393 dTDP-D-glucose 4;6-dehydratase 2.6

309398 EOS09329 AW081820 EST singleton (not in UniGene) with exon hit 26 

338727 EOS38658 CH22J523FG_UNK w E\tAC005500.GENSCAN^)0-2 

CH22_EMAC005500.GENSCAN.500-2 26 

324620 EOS24551 AA448021 EST cluster (not in UniGene) 26 

75 335755 EOS356B6 CH22_3122FG_604_4_UNK^EMAC005500.GENSCAR493-9 

CH2i.FGENES.604J 26 

315858 EOS15789 AA737345 EST cluster (not in UniGene) 26 

307288 EOS07219 A1205169 EST singleton (not in UniGene) with exon hit 25 

330542 EOS30473 U23942 Hs.226213 cytochrome P450; 51 (lanosterol 14-alpha-demethylase) 2.5

80 335896 EOS35827 CH22_3273FG^635_4_UNK_EM^C0055D0.GENSCAN.52^6 

CH2^FGENES.635.4 25 

316578 EOS16509 AA775623 Hs.211683 ESTs 25 
329193 EOS29124 c_3L_hs gi|5B6B71 6|ref] gn 3 + 168095 168181 ex 9 9 CDSI -1.11 87 2064 

CHXJJsgI|5868716 25 

85 315193 EOS15124 AI241331 Hs.131765 ESTs 25 

319478 EOS19409 R06841 EST cluster (not in UniGene) 25 
334727 EOS34658 CH22»2038F6_424_1_UNK^EMAC005500.GENSCAN.28M

CH22_FGENES.424J 2.5 

32B113 EOS28044 cjja pj|5868024|ref| gn 2 - 80378 80491 ex 23 CDSI 3,891143247 

CH.06 fts 0(5868024 2.5 

315214 EOS15145 AI915927 Hs.34771 ESTs" 25 

324718 EOS24649 AI557019 Hs.1 16467 ESTs 25 

313328 EOS13257 AI088120 Hs.122329 ESTs 2.5 

319480 EOS19411 R06933 Hs.184221 ESTs 25 

317902 EOS17833 AI828602 Hs.211265 ESTs 2.5 

323341 EOS23272 AL134B75 Hs.192386 ESTs 2.5 

338003 EOS35934 CH22.3385FG_664„4_UNKJU32liaGENSCAN54 

CH22_FGENES.664J 25 

322992 EOS22923 AA142B91 Hs.193165 ESTs 25 

314911 EOS14842 AW292329 Hs.163481 ESTs 25 

313603 EOS13534 AW468119 EST duster (not In UniGsna) 2.5 

306469 EOS06400 AA983792 EST singleton (not in UniGsna) with axon hit 2.5 

324715 EOS24646 AI739168 EST cluster (not In UniGane) 2.5 

302455 EOS02386 AA356923 Hs.240770 nuclear cap binding protein subunit 2; 20kD 2.4

321023 EOS20954 H25135 Hs.125608 ESTs 24 

302099 EOS02030 AL021397 Hs.137576 ribosomal protein L34 pseudogene 1 2.4

314092 EOS14023 AI984040 Hs.226946 ESTs 24 

316587 EOS18518 AA779704 Hs.168830 ESTs 24 

303702 EOS03633 AW500748 Hs.224961 ESTs; Weakly similar to 73 kDA subunit of cleavage and polyadenylation specificity factor [H.sapiens] 2.4

301622 EOS01753 X17033 Hs.1142 integrin; alpha 2 (CD49B; alpha 2 subunit of VLA-2 receptor) 2.4 

322694 EOS22625 AI110872 EST cluster (not In UntGene) 24 

323333 EOS23264 AA228883 EST cluster (not In UnlGene) 24 

301954 EOS01885 AJ009936 Hs.118138 nudaar receptor subfamily 1; group I; member 2 4 2.4 

331363 EOS31294 AA421562 Hs.91011 anterior gradient 2 (Xenopus laevis) homolog 2.4

303811 EOS03742 AW182340 Hs.246155 ESTs; Weakly similar to DNA TOPOISOMERASE I [H.sapiens] 2.4

306243 EOS08174 A1560037 EST singleton (not In UnlGene) with exonhH 24 

336021 EOS35952 CH22_34O4FG_669J0.UNreW32l10.GENSCAN.9-15 

CH22_FGENES.669_10 24 

334789 EOS34720 CH22.2101 FG_432^14_UNK_£KtAC00550aG£NSCAN-S3-1 7 

CH2Z_FGENES.432J4 24 

320807 EOS20738 AA086110 Hs.188536 Homo sapiens clone 24838 mRNA sequence 2.4

328903 EOS28834 cj_tegi|5868514lref|gn 1 + 23625 24468 ex 3 5 CDS} 91.18 844 219 

CH.08J»sgl|5868514 24 

338759 EOS38690 CH2^7581FG__UNK_E^CX05500.(3=NSCAN.517^ 

<>t22_EIMCOu^500.GENSCAfi517-6 23 

333769 EOS33700 CH22_1036FG^71_8_UMieEWtAC005500.GENSCAN.127-8 

CH22_FGENES.271_8 23 

303597 EOS03528 AT792141 Hs. 143560 ESTs; Weakly similar to brain mitochondrial carrier protein-1 |Hsap«ns] 23 

305898 EOS05829 AA872838 Hs.242463 keratin 8 23 

304439 EOS04370 AA396882 EST singleton (not hUniGene) with exon hit 23 

301604 EOS01535 AA373124 Hs.105837 ESTs; Weakly similar to C17G10.1 [Cetegans] 23 

315071 EOS15002 AA5526S0 Hs.152423 ESTs 23 

330565 EOS30496 U51095 Hs.1545 caudal type homeo box transcription factor 1 2.3

331569 EOS31520 N71027 Hs.41856 ESTs - 23 

303216 EOS03147 AA581439 Hs.152328 ESTs 23 

324988 EOS24919 T06997 EST cluster (not in UnlGene) 23 

312996 EOS12927 AA249018 EST cluster (not in UnlGene) 23 

532314 EOS32245 725862 Hs.101774 ESTs 23 

313325 EOS13256 AI420611 Hs.127832 ESTs 23 

322991 EOS22922 C18965 Hs.159473 ESTs 23 

335498 EOS35A27 CH22jmFG_571JJ)NK.EM*C0Q55W.GEmM.46Q-25 

CH22_FGENES^71 4 23 

315135 EOS15066 AA627561 Hs.192446 ESTs 23 

319488 EOS19419 AW250340 EST cluster (not in UnlGene) 23 

323571 EOS23502 AA984133 Hs.1 53260 c-OWrrteracling protein 23 

322826 EOS22757 AI807683 Hs.156932 ESTs " 23 

322221 EOS22152 AI890619 Hs.179662 nucleosome assembly protein 1-like 1 2.3

312242 EOS12173 AI380207 Hs.125276 ESTs 23 

315238 EOS15169 AA593867 Hs.170890 ESTs 23 

315168 EOS15099 AA622130 Hs.152524 ESTs 23 

300504 EOS00435 AW204624 Hs.1 92927 ESTs; Weakly similar to Urn kinase [H .sapiens] 23 

323243 EOS23174 W44372 EST cluster (not in UnlGene) 23 

331628 EOS31559 R80965 Hs.204079 ESTs 23 

320746 EOS20677 AA128302 EST duster (not in UnlGene) 23 

324598 ■ EOS24529 AA502659 Hs.163986 ESTs 23 

308667 EOS08598 A1758754 EST singleton (not in UnlGene) with exon hit 22 

302944 EOS02875 AA340708 Hs.256204 ESTs; Weakly similar to cyclic nucteotide-gated channel beta subunit [Rnorvegicus] 22 

316291 EOS16222 AW375974 Hs.156704 ESTs 22 

315296 EOS15227 AA876905 Hs.125286 ESTs 22 

334150 EOS34081 CH22_1429FG_339J.UNK_E^C00^50aGENSCAN.189-1 

CH22_FGENES.339J 22 

331380 EOS31311 AA453266 Hs246131 ESTs 22 

321795 EOS21726 AI796696 Hs.222446 ESTs 22 

331493 EOS31424 N34357 Hs.44571 ESTs 22 

312890 EOS12821 AI813654 Hs.127478 ESTs 22 

3155B3 EOS15514 AW003622 Hs.126555 ESTs 22 

314306 EOS14237 AI697901 Hs.192425 ESTs 22 . 

314138 EOS14069 AA740616 EST cluster (not in UniGene) 22 

302656 EOS02587 AW293005 Hs.220905 ESTs 22 

313564 EOS13495 AA810141 Hs.192182 ESTs 22 

332792 EOS32723 CH22JFGj_2_UNK^C4G1.GENSCAN.3-2 

CH22_FGENES.3_2 22 
332020 EOS31951 AA488895 Hs.105219 ESTs 2.2

315143 EOS15074 AA878324 Hs.192734 ESTs 2.2 

313385 EOS13316 AI0320B7 Hs.176711 ESTs 2.2 

323835 EQS23766 AL042005 EST duster (ftol in UmGene) 22 

314014 EOS13945 AW291B47 Hs.121715 ESTs; Weakly similar to HP proton [H-sapiens] 2.2 

336016 EOS35947 CH2^_3399FG.669„5_UNK-DJ32I10.GENSCAM9-10 

CH22^FGENES.669_5 2.2 

323218 EOS23149 AF131846 Hs.13395 Homo sapiens done 25028 mRNA sequence 2.2 

338059 EOS37990 CH2^6561FG_UNK.EM:AC005S00.GENSCAN.1604 

CH22_Ei^AC005500.GENSCAN.1604 2.2 

302613 EOS02544 AA371059 Hs.251636 ubiquitin specific protease 3 2.2

304652 EOS04783 AA588595 EST singleton (not In UniGene) with exon hit 22 

308457 EOS08388 A/669859 EST singleton (not In UniGene) with exon hit 12 

311736 EOS116S7 AA765897 EST duster (not In UniGene) 12 

334183 EOS34114 Cr^1464FO.350J3JJNrC^C005500.GENSCANmi6 

CH22JGENES.350J3 12 

315021 EOS14952 AA533447 EST cluster (not in UniGene) 12 

303013 EOS02944 F07898 Hs.214190 interleukin enhancer binding factor 1 2.2

315006 EOS14937 AI53B613 Hs.135657 ESTs 2.2 

337534 EOS37465 CH22_5803FG_828J_ CH22 w FGENES.828-3 22 

303276 EOSQ3207 AA431599 Hs.132799 ESTs 2.1 

318617 EOS18548 AW247252 Hs.75514 nucleoside phosphorylase 2.1 

330760 EOS30691 AA448663 Hs-30469 ESTs 2.1 

31S545 EOS19476 R83716 Hs.14355 ESTs 21 

312252 EOS12163 A1128388 Hs.143655 ESTs 2.1 

322882 EOS22813 AW248508 Hs.2491 DiGeorge syndrome critical region gene 2 2.1

312684 EOS12615 AW294020 Hs.117721 ESTs 21 

315782 EOS15713 AW515455 Hs.115558 ESTs; Weakly similar to llli ALU SUBFAMILY J WARNING ENTRY HQ [Rsaptens] 2.1 

320076 EOS20007 AI653733 Hs.204079 ESTs 2.1 

300566 EOS00497 H86709 Hs.21371 son of sevenless (Drosophila) homolog 1 2.1

300908 EOS00839 AA618335 Hs.146137 ESTs; Weakly similar to putative [Ceieoans] 21 

314778 EOS14709 AW079559 Hs.152258 ESTs 21 

319233 EOS19164 R21054 Hs.211522 ESTs 2.1 

335488 EOS35419 CH22^840FG_570_20JJNK w E\tAC005500.GENSCAN .460-1 5 

CH22J=GENES.570_2Q 21 

334816 EOS34547 CH22J 923FG.41 1_1 5_UMrCEM^C005500.GENSCAN^74-22 

CH22_FGENES.411_15 21 

306792 EOS06723 AI042426 EST singleton (not In UniGene) with exon hit 21 

301661 EOS01592 AI815558 EST duster (not in UniGene) with exon hit 21 

311332 EOS11263 AW292247 H&255052 ESTs 21 

314785 EOS14716 AI538226 Hs.135184 ESTs 21 

301460 EOS01391 AW196758 Hs.165998 DKFZP564M2423 protein 21 

332015 EOS31946 AA487910 Hs.208800 ESTs; Weakly similar to !i!l ALU CLASS B WARNING ENTRY !!!! [H.6apiens] 21 

321529 EOS21460 A1269506 Hs.146056 ESTs 21 

323740 EOS23671 AA324643 Hs.246106 ESTs 2.1 

336019 EOS35950 CH22«34O2FG„669.8_UNieDJ32J10.GENSCAN.9.13 

CH22_FGENES.669 8 21 

314954 EOS14885 AA521381 Hs.187726 ESTs 21 

303037 EOSQ296B AF118395 EST duster (not in UniGene) with exon hit 21 

302056 EOS01S87 A1457532 Hs.126082 ESTs; Moderately slmlarto ROSA26AS [M.musculus] 21 

315178 EOS15109 AW362945 Hs.162459 ESTs 21 

332246 EOS32177 N57927 Hs.120777 ESTs; Weakly similar to RNA POLYMERASE II ELONGATION FACTOR ELL2 [Rsapiens] 20 

334288 EOS34219 CH22_1577FG_369_18JUN)eEMAC005500.GENSCAN.229-18 

CHZLFGENES.369J8 20 

324690 EOS24621 N86286 Hs.132808 ESTs; Weakly similar to Similar to S.pombe -rad4+fcut5+product [H^apiens] 20 

305257 EOS05188 AA6790O5 EST singleton (not in UniGene) with exon hit 20 

311315 EOS11246 AW450536 Hs209260 ESTs 20 

311988 EOS11919 AW016096 Hs.13801 ESTs 20 

302638 EOS02569 AA463798 Hs.102695 ESTs; Weakty slmBar to C1 1 D24 [CetegansI 20 

320531 EOS20462 W03691 Hs248B4 ESTs; Moderately similar to RNA polymerase 1 associated factor [Mmusculus] 20 

323604 EOS23535 AI751438 Hs.182B27 ESTs; Weakly similar to iill ALU SUBFAMILY SQ WARNING ENTRY III! [H.sapfens] 20 

308852 EOS08783 AI829848 Hs.182937 peptidylprolyl isomerase A (cyclophilin A) 2.0

320521 EOS20452 N31464 H&24743 ESTs 20 

331306 EOS31237 AA252079 Hs.63331 dachshund (Drosophila) homolog 2.0

314941 EOS14872 AA5159Q2 Hs.130650 ESTs 20 

336684 EOS36615 CH22_4167FGJ6_1_ CH22_FGENES.4$-1 20 

301137 EOS01068 AF049569 Hs.137096 ESTs 20 

338454 EOS38385 (^712BFG_UNK_EM^C00550aGENSCAN.36O4 

CH22_EMAC005500.GENSCAN.36O4 20 

309700 EOS09631 AW241170 Hs.179661 Homo sapiens clone 24703 beta-tubulin mRNA; complete cds 2.0

330262 EOS30193 c_5j>2 gi|6671884lgb|Agn 1+67913 68053 ex 3 3 CDS! 5.41 141 597 

CR05_p2giI6571884 20 

324163 EOS24094 AUM6827 Hs.134651 ESTs 20 

316493 EOS16424 AA766142 Hs.131810 ESTs; Weakly similar to Iill ALU SUBFAMILY J WARNING ENTRY llli [tapiens] 20 

311873 EOS11804 AA730045 Hs,187866 ESTs 20 

326757 EOS26688 c20jjs gl|6249610|refl gn 3 +74531 74597 ex 1 3 CDSf 9.52671416 

CH.20Jisgi|6249610 20 

319167 EOS19098 F05984 Hs.250138 protein phosphatase 2C; magnesium-dependent; catalytic subunit 2.0

316011 EOS15942 AW516953 Hs.201372 ESTs 20 

313S35 EOS13566 AA507227 Hs.6390 ESTs 20 

310027 EOS09958 AW449009 Hs.126647 ESTs 20 

336662 EOS36593 CH22j*138FG_41J_ CH22JFGENES.41-1 20 

334648 EOS34579 C>l22^1956FGJ17J5.UNICEmcm00.GENSCANmi5 

CH22_FGENES.417J5 20 

30B676 EOS08607 AI761036 EST singleton (not In UniGene) with exon hit 20 

312047 EOS11976 AA588275 Hs.14258 ESTs 20 
324826 EOS24757 M704806 Hs.143842 ESTs 2.0

322889 EOS22820 AA081924 Hs.211417 ESTs 20 

316345 EOS16276 AW139408 Hs.152940 ESTs 2.0 

313922 EOS13853 AI702038 Hs.100057 ESTs 2.0 

319423 EOS19354 T83024 Hs.15119 ESTs 2.0

320244 EOS20175 AA296922 Hs.129778 gastrointestinal peptide 2.0

308957 EOS08888 A1889642 EST singleton (not in UniGene) with axon hit 2.0 

334223 EOS34154 CH2?_1507FG_360XUNK_EiytAC005500.GENSCAN.2184 

, . CH22J : GENES.360_4 1.9 

10 302980 EOS02911 W93435 EST cluster (not In UnlGene) with exon hit 1.9 

312153 EOS12084 AA759250 Hs.153028 cytochrome b-561 1.9

326460 EOS26391 c19Jis glj5867400[mf] gn 3 - 142633 142935 ex 1 2 CDS! 19.03 303 1731 

CR19 hsgi|5B67400 1.9 

319962 EOS19893 H06350 Hs.135056 ESTs" 1.9 

15 307064 EOS0S995 AI149335 EST slngteton (not in UnlGene) with exon hit 1.9 

331608 EOS31539 N89661 Hs.44162 ESTs; Weakly similar to cDNA EST yk342h1 2.5 comes from this gene [OelegansJ V.9 

328142 EOS28073 c_6Jis gi]5868050|ref) gn 1 • 9656 9778 ex 2 6 CDSi 11 .1 1 123 3339 

CH.06 hsgi|5SS8050 1.9 

312527 EOS12458 AI695522 Hs.191271 ESTs 1.9 

20 318581 EOS1B512 AA769058 EST duster (not in UnlGene) 1.9 

319979 EOS19910 AB018281 Hs.107479 KIAA0738 gene product 1.9

336107 EOS36038 CH2^3496FGJ96XUNKJDA59H18.GENSCAN.4^ 

CH22_FGENES.696 3 1.9 

305232 EOS05163 AA670052 Hs.195188 glyceraldehyde-3-phosphate dehydrogenase 1.9

25 315043 EOS14974 AA806538 Hs.130732 ESTs 1.9 

323377 EOS23308 AA133260 Hs.8454 protein kinase; cAMP-dependent regulatory; type II; alpha 1.9

338260 EOS3B191 CH22^6863FG_UNK^EM:AC005500.GENSCAN.279-10 

CH22_EKtAC005500.GENSCAN.279-1 0 1.9 

334891 EOS34822 CH22_22TCFGJ5^5jJNr^EMAC005500.GENSCAN.341-8 

30 CH22.FGENES.452J 1.9 

316055 EOS15986 AA693880 EST cluster (not in UnlGene) 1.9 

312414 EOS12345 AI915014 Hs.164235 ESTs; Weakly sknBar to Itll ALU SUBFAMILY J WARNING ENTRY Ml [Usapiens] 1.9 

300225 EOS00156 AI989963 Hs.1975Q5 ESTs 1.9 

332507 EOS32538 R41791 Hs.36566 LIM domain kinase 1 1.9

35 312405 EOS12336 AI523875 EST cluster (not In UnlGene) 1.9 

313805 EOS13536 AI761786 Hs.204674 ESTs 1.9 

337755 EOS37686 CH22_6105FG_UNK-EWAC000097.GENSCAN.109-2 

CH22LEMACO00O97.GENSCAN.1O9-2 1.9 

323216 EOS23147 AA332145 EST cluster (not in UniGene) 1.9 

40 334872 EOS34803 CH22J2186FGJ50JLUNK.EM^O)05500.GENSCAN.339-2 

CH22_FG£NES,450 2 1.9 

332034 EOS31985 AA489847 Hs.112019 ESTs; Moderately similar to 111! ALU SUBFAMILY J WARNING ENTRY till [Rsapiens] 1.9 

332103 EOS32034 AA6Q9161 Hs.112657 ESTs; Weakly similar to ORF YOR243c IS.cerevtslae] 1.9 

318196 EOS18127 AI056776 Hs.133397 ESTs 1.9 
45 329141 EOS29072 c_x_hs giI6O17060|ref| gn 1 + 343924 343997 ex 2 3 CDSi 8.53 74 1715 

OiX_nsgi|6017060 1.9 

321539 EOS21470 N98619 Hs.62461 ARP2 (actin-related protein 2; yeast) homolog 1.9

313881 EOS13812 AA535580 Hs.16331 ESTs 1.9 

314046 EOS13977 AWQ21917 Hs.1 81878 ESTs 1.9 
50 336045 EOS35976 CH22_3430FG_679_7_UNICDJ32J10.GENSCAN.1M 

CH2LFGENES.679_7 1.9 

324799 EOS24730 AW272262 Hs.250468 ESTs 1.9 

312656 EOS12587 AW152449 t&2264S9 ESTs 1.9 

324662 EOS24593 AW5046B9 EST duster (not in UniGene) 1.9 

55 323930 EOS23861 AA570698 Hs.193203 ESTs 1.9 

314465 EOS14396 AA602917 Hs.156974 ESTs 1.9 

335897 EOS35828 CH22_3274FG_635_5_UNK_EMAC00550aGENSCAN.525-7 

CH22 W FG£NES.635_5 1.9 

321746 EOS21677 AI806500 Hs.102652 ESTs; Weatty similar to K1AA0437 [Usapiens] 1.9 
60 335687 EOS35618 CH2^3048FG_596_2_U^BAAC005^aGENSCAN.488-2 

CH22_FGENES.596_2 1.9 

330731 EOS30662 AA278816 Hs.177204 ESTs 1.9 

315542 EOS15473 AA079476 Hs.1 09857 ESTs; Highly similar to CGW9 protein [H-saplens] 1.9 

336379 EOS35310 CH2?_3791FG_821.7JJNK_BA232E17.GENSCAN.4-19 

65 CH2*_FGENES.821_7 1.9 

305691 EOS05622 AA813590 Hs.119500 karyopherin alpha 4 (importin alpha 3) 1.9

310639 EOS10570 AW269082 Hs.175162 ESTs 1.9 

327481 EOS27412 cJUis gi]5867783|reflgn 3 ♦104472104673 ex 1 4 CDSf 14.332021308 

CH.02Jisgil5867783 1.9 

301910 EOS01841 T84852 Hs.98370 cytochrome P540 family member predicted from ESTs 1.9

335478 EOS35409 CH22_2830FG_569_1_UNlCEIAAO005500.GENSCAN,456-1 

CH22.FGENES.569J 1.9 

331135 EOS31066 R61398 Hs.4197 ESTs 1.9 

335690 EOS35621 CH2^3051 FG.596 J JJMeE^C0055M.GENSCAN.488-5 

75 CH22_FGENES.598J 1.9 

308047 EOS07978 AI45S633 EST singleton (not in UniGene) with exon hit 1.9 

334500 EOS34431 CH2SJ800FGJ97 16JJNfCEMAC005KW.GENSCAN^60-18 

CH22_FGENr&397J6 1.9 

338250 EOS38181 CH22_6B48FG_UNK^EMAC005500.GHNSCAN.26J- 

80 2 CH22_EfctAC005500.GENSCAN.269-2 1.8 

320618 EOS20549 A122Q276 H&23522B EST 1.8 

335044 EOS34975 CH22_2367FG_480.1_UNK_EM^C005500.GENSCAN.374-1 

CH22_FGENES.480J * 1.8 

313789 EOS13720 AJ167810 Hs^17743 ESTs 1.8 

85 311911 EOS11842 AI087123 Hs.114434 ESTs; Weakly similar to lit! ALU SUBFAMILY J WARNING ENTRY III! [H.saplens] 1.8 

320180 EOS20111 AA846203 Hs.193974 ESTs; Weakly similar to alternatively spliced product using exon 1 3A [Ksapiens] 1.8 
311036 EOS10967 AI539227 Hs.214039 ESTs 1.8

323303 EOS23B34 AA773680 Hs.193598 ESTs « 

318676 EOS18607 T57448 Hs.15467 ESTsiModaralalysintotapulallwphosphobosifitte 1.8 

303007 EOS02938 AA478876 Hs.7037 pallid (mouse) homolog; pallidin 1.8
334806 EOS34737 CH22J119FG_435J.UMK^MAC005500.GENSCAN.29W 

CH22.FGENES.435 7 1.8 

311767 E0811698 AI076686 Hs.190066 ESTs 1.8 

331750 EOS31681 AA284372 Hs.111471 ESTs 1.8 

314872 EOS14803 AI144254 Hs.239726 ESTs 1.8 

314071 EOS14002 AA192455 Hs.188S90 ESTs 1.8 
328450 EOS2B381 OJis gf|586B425|refl gn 2 - 209192 209321 ex 2 3 CDSI ia41 130 1407 

CH.07 hsgi|5B68425 1.8 
328857 EOS28788 c_7Jis g]}63B1927lrsf] gn 3 - 80557 81051 ex 1 1 CDSo 41.51 495 6080 

CH.07_hs 856381927 1.8 

313781 EOS13712 AA078836 EST duster (not in UniGene) 1.8 

336953 EOS36B84 CH22_4745FG_361_22_ CH22_FGENES.361-22 18 

300233 EOS00164 AI380777 Hs.189402 ESTs 16 
326862 EOS26793 c20J» #552465N 9" 2 + 107702 107782 ex 12 13 COSi 3.6281 2149 

CH.20 hsgl|6552465 18 

312364 EOS12295 R40111 Hs.187618 ESTs" 18 

321541 EOS21472 AI220292 Hs.254467 ESTs 18 

307432 EOS07363 AI244259 Hs.181165 eukaryotic translation elongation factor 1 alpha 1 1.8

320921 EOS20852 R94038 Hs.199536 inhibin; beta C 1.8
333110 EOS33041 CH22_338FG_79J6_UNK>EIVtAC000097.GENSCAN.59-15 

CH22LFGENES.79J6 18 

324914 EOS24845 AA847510 Hs.161292 ESTs 18 

312681 EOS12612 AI028149 Hs.193124 pyruvate dehydrogenase kinase; isoenzyme 3 1.8
335697 EOS35628 C^22„3058FG_596J2_UNK_EMAC00550aGENSCAN.4B8-13 

CH22_FGENES.596J2 18 

308462 EOS08393 AI571311 EST singleton (not in UniGene) with exon hit U 

312138 BOSi2DS9 7B9405 Hs.218851 ESTs; WfeaWy similar to HI1ALU SUBFAMILY J WARNING ENTRY mftufim] 18 

309116 EOS09047 AI927149 Hs.29797 ribosomal protein L10 1.8

320730 EOS20661 AA534539 Hs.151072 ESTs 18 

300844 EOS00775 ALQ42759 Hs.191762 ESTs 18 
337570 EOS37501 CH22_5856FG__UNK_C65E1.GENSCAN.4-2 

CH22_C6K1.GENSCAN.4-2 18 

332756 EOS32687 D63479 Hs.115907 diacylglycerol kinase; delta (130kD) 1.8

332161 EOS32092 AA621523 Hs.165464 ESTs 18 

300942 EOS00873 AW275006 Hs.195959 ESTs 18 

300680 EOS00611 AW468066 Hs.257712 ESTs; Weakly similar to KIAA0985 protein [Usapiens] 18 
328783 EOS28714 c_7_hs gi|5868309|ref| gn 5 - 73658 73822 ex 2 5 CDSI 0.78 165 5371

CH.07_hs gi|5868309 1.8

307542 EOS07473 AI280859 EST singleton (not in UniGene) with exon hit 1.8

331975 EOS31906 AA464972 Hs.99624 ESTs 1.8

321532 EOS21463 T77886 Hs.83428 nuclear factor of kappa light polypeptide gene enhancer in B-cells 1 (p105) 1.8

318721 EOS18652 Z28504 EST cluster (not in UniGene) 1.8

302124 EOS02055 AB023967 Hs.145078 regulator of differentiation (in S. pombe) 1 1.8

323541 EOS23472 AI185116 Hs.104613 ESTs; Weakly similar to Similar to S.cerevisiae hypothetical protein L3111 [H.sapiens] 1.8

331057 EOS30988 N71399 Hs.28143 ESTs 1.8

316860 EOS16791 AW139099 Hs.127489 ESTs 1.8

330601 EOS30532 U90916 Hs.82845 Human clone 23815 mRNA sequence 1.8

307334 EOS07265 AI214811 Hs.220615 ESTs; Weakly similar to TFIH protein [H.sapiens] 1.8

323195 EOS23126 AI064982 Hs.117950 multifunctional polypeptide similar to SAICAR synthetase and AIR carboxylase 1.8

303856 EOS03787 AA968589 Hs.944 glucose phosphate isomerase 1.8

321553 EOS21484 K92449 Hs.116406 ESTs 1.8

332705 EOS32636 T59161 Hs.76293 thymosin; beta 10 1.8
333139 EOS33070 CH22_368FGJ3_16_UKK_BAACO00O97.GENSCAN.67-19

CH22_FGENES.83_16 1.8
338997 EOS38928 CH22_7B81FG_UNK_DA59H18.GENSCAN.8-22

CH22_DA59H18.GENSCAN.8-22 1.8

301509 EOS01440 AI025435 Hs.117532 ESTs 1.8

314522 EOS14453 AI732331 Hs.187750 ESTs;M<rieratelysmTaartoHH^^ 18 

303072 EOS03003 AF157833 EST cluster (not in UniGene) with exon hit 1.8

305271 EOS05202 AA679895 EST singleton (not in UniGene) with exon hit 1.8

335287 EOS35218 CH22_2629FG_526_1 1_UNK_EttAC005500.GENSCAN.42lM

CH22_FGENES.526_11 1.8

321286 EOS21217 AI380940 EST cluster (not in UniGene) 1.8

318740 EOS18671 NM_002543 EST cluster (not in UniGene) 1.8

323465 EOS23396 AA287406 EST cluster (not in UniGene) 1.8

300611 EOS00542 N75450 EST cluster (not in UniGene) with exon hit 1.8

306235 EOS06166 AA932299 EST singleton (not in UniGene) with exon hit 1.8

336721 EOS36652 CH22_4244FG_83_17_ CH22_FGENES.83-17 1.8

311291 EOS11222 AA782601 Hs.122684 ESTs 1.8

310247 EOS10178 AI224982 Hs.211454 ESTs 1.8

316564 EOS16495 AI743571 Hs.168799 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.8
328170 EOS28101 c_6_hs gi|5868071|ref| gn 1 + 93170 93295 ex 9 9 CDS! 13.31 126 3591

CH.06_hs gi|5868071 1.8

300909 EOS00840 AW295479 Hs.154903 ESTs; Weakly similar to Abl substrate ena [D.melanogaster] 1.8

330869 EOS30800 AA115197 Hs.183702 ESTs 1.8

311048 EOS10979 AA506952 Hs.210508 ESTs 1.8
333764 EOS33695 CH22_1O31FG_271_3_UNK_EM^C0055OaGENSCAN.127-3

CH22_FGENES.271_3 1.8
338862 EOS38793 CH2*J715FG_UN^DJ32l10.GENSCAN.1-6

CH22_DJ32I10.GENSCAN.1-6 1.8

331467 EOS31398 N22206 Hs.43112 ESTs 1.8
327742 EOS27673 c_5_hs gi|5867944|ref| gn 3 - 143307 143512 ex 1 3 CDS1 11.07 206 172
CH.05_hs gi|5867944 1.8

320955 EOS20886 AL049415 Hs^04290 Homo sapiens mRNA; cDNA DKFZp586N2119 (from clone DKFZp586N2119) 1.8

323589 EOS23520 AW39C054 Hs.192843 ESTs 1.8

319961 EOS18882 AA307665 Hs.14559 ESTs 1.8
333763 EOS33694 CH22_1030FG.271XUNK_EMJ\C005500.GENSCAN.127-2

CH22_FGENES.271_2 1.7

331046 EOS30977 N68S63 Hs.191358 ESTs 1.7

320001 EOS19932 AA873350 EST cluster (not in UniGene) 1.7

316869 EOS16800 AI954880 Hs.134604 ESTs 1.7

310774 EOS10705 AW134483 Hs.164371 ESTs 1.7

319379 EOS19310 T81443 Hs.193963 ESTs 1.7

321549 EOS21480 AA470984 Hs.161947 ESTs 1.7

300823 EOS00754 AI8630E8 Hs722665 ESTs; Weakly similar to putative zinc finger protein NY-REN-34 antigen [H.sapiens] 1.7

324228 EOS24159 AI796146 Hs.207780 ESTs 1.7

313902 EOS13833 AI308165 Hs.156242 ESTs 1.7

308928 EOS08859 AI863906 EST singleton (not in UniGene) with exon hit 1.7

333770 EOS33701 CH22_1037FG_272_1_UNK^AC005500.GENSCAN.127-10 

CH22_FGENES.272_1 1.7

316934 EOS16865 AI571647 Hs.146170 ESTs 1.7

313219 EOS13150 N74924 Hs.182099 ESTs 1.7

317360 EOS17291 AI125252 Hs.126419 ESTs 1.7

303530 EOS03461 AI274851 Hs.258744 ESTs 1.7
334739 EOS34670 CH22_2061FG_424_14_UNK^EM;AC005500.GENSCAN^35-16 

CH22_FGENES.424_14 1.7

337670 EOS37601 CH22_5996FG_UNK^EM:AC00QG97.GENSCM.57-2

CH22_EM:AC000097.GENSCAN.57-2 1.7

312079 EOS12010 T79745 Hs.189717 ESTs 1.7

320211 EOS20142 AL039402 Hs.125783 DEME-6 protein 1.7

316218 EOS16149 AW207642 Hs.174021 ESTs 1.7

335682 EOS35613 CH22_3043FG_5S5XUNlCEfAAC00550aGENSCAN.487-1 1

CH22_FGENES.595_2 1.7

330696 EOS30627 AA022632 Hs.15825 ESTs 1.7

314449 EOS14380 AL042667 Hs.225539 ESTs 1.7

311972 EOS11903 N51511 Hs.188449 ESTs 1.7

307691 EOS07622 AI318285 Hs.182371 prothymosin; alpha (gene sequence 28) 1.7
338249 EOS38180 CH22_6847FG_UNieENtACX)0550aGENSCAN^69.1 

CH22JEMAC005500.GENSCANi69-1 1.7 
326399 EOS26330 c19_hs gi|5867353|ref| gn 1 ■» 6385 6536 ex 6 6 CDS1 10.69 152 684

CH.19_hs gi|5867353 1.7

313290 EOS13221 AI753247 Hs.206454 ESTs 1.7

301616 EOS01546 W39477 EST cluster (not in UniGene) with exon hit 1.7

307034 EOS06965 AI142526 EST singleton (not in UniGene) with exon hit 1.7

313577 EOS13508 AA565051 Hs.155029 ESTs 1.7

324703 EOS24634 AB009282 Hs.31086 HonmaptensmRMfor(7tamrom^ 1.7

321317 EOS21248 AI937050 Hs.202040 ESTs; Weakly similar to KIAA0938 protein [H.sapiens] 1.7

312278 EOS12209 AW205234 Hs.201587 ESTs 1.7 
333358 EOS33289 CHZL604FG 141.9 UNK.EMAC005500.GENSCAN^1-9 

CH22_FGENES.141_9 1.7 

322735 EOS22666 AA086123 EST cluster (not in UniGene) 1.7

326752 EOS26683 c20_hs gi|5867615|ref| gn 1 - 1214 1562 ex 2 2 CDSf 33.07 349

CH.20_hs gi|5867615 1.7

314733 EOS14664 AW452355 Hs.256037 ESTs 1.7

312902 EOS12833 AW292797 Hs.130316 ESTs 1.7

322653 EOS22584 AI828854 Hs.171891 ESTs 1.7

336015 EOS35946 CH22_3398FG.669_4JJNK_DJ32H0.GENSCAM9-9

CH22_FGENES.669_4 1.7

324500 EOS24431 AW269819 Hs.169905 ESTs 1.7

310900 EOS10831 AI922728 Hs.165803 ESTs; Weakly similar to !!!! ALU SUBFAMILY SB WARNING ENTRY !!!! [H.sapiens] 1.7
337908 EOS37839 CH2^6323FG_UNK3ftAC0055Q0.GENSCAN.57-1 

CH22_EM:AC005500.GENSCAN.57-1 1.7

304084 EOS04015 T91986 EST singleton (not in UniGene) with exon hit 1.7

332539 EOS32470 AA412528 Hs.20183 ESTs; Weakly similar to cDNA EST EMBL:T01421 comes from this gene [C.elegans] 1.7

314332 EOS14263 AL037551 Hs.95612 ESTs 1.7

321412 EOS21343 AW366305 EST cluster (not in UniGene) 1.7

312187 EOS12118 AA700439 Hs.188490 ESTs 1.7

314147 EOS14078 AI656135 Hs.129805 ESTs 1.7

303131 EOS03062 AW081061 Hs.103180 acMe6 1.7

331341 EOS31272 AA303125 Hs.119009 ESTs; Weakly similar to !!!! ALU SUBFAMILY SB2 WARNING ENTRY !!!! [H.sapiens] 1.7

313615 EOS13546 AW295194 Hs.25264 DKFZP434N126 protein 1.7

329598 EOS29529 c10_p2 gi|3962482|gb|A gn 4 + 39924 40220 ex 2 3 CDS! 8.71 297 420

CH.10_p2 gi|3962482 1.7

303579 EOS03510 AA381124 Hs.193353 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.7

331692 EOS31623 W93592 Hs.47343 ESTs 1.7

323977 EOS23908 AW326177 Hs.234713 ESTs 1.7

332930 EOS32861 CH22.151FG_38_4.UNICC20H12.GENSCAN.294

CH22_FGENES.38_4 1.7
326596 EOS26527 c19_hs gi|6138928|ref| gn 4 + 133386 133563 ex 7 9 CDSI -1.32 178 3520

CH.19_hs gi|6138928 1.7

314946 EOS14877 AI097229 Hs.217484 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.7

315357 EOS15288 AAS08684 Hs.121705 ESTs; Moderately similar to !!!! ALU CLASS C WARNING ENTRY !!!! [H.sapiens] 1.7

324728 EOS24659 AA303024 EST cluster (not in UniGene) 1.7

317501 EOS17432 AA931245 Hs.137097 ESTs 1.7

332219 EOS32150 N22508 Hs.139315 ESTs 1.7
335369 EOS35300 CH22_2718FG_543J_UNK_EMAC005500.GENSCAN.432-9

CH22_FGENES.543_7 1.7

322417 EOS22348 W3628S Hs.171873 ESTs; Weakly similar to PUTATIVE STEROID DEHYDROGENASE WK-I [M.musculus] 1.7

316100 EOS16031 AW203986 Hs.213003 ESTs 1.7

314866 EOS14797 AW305124 Hs.191682 ESTs 1.7

300328 EOS00259 AW015860 Hs.224623 ESTs 1.7

315676 EOS15607 AW002565 Hs.136590 ESTs 1.7

314183 EOS14114 AA748600 EST cluster (not in UniGene) 1.7

321354 EOS21285 AA078493 EST cluster (not in UniGene) 1.7

311904 EOS11835 T86907 Hs.119371 ESTs 1.7

322890 EOS22821 AA082030 EST cluster (not in UniGene) 1.7

302759 EOS02690 AI885815 Hs.184727 ESTs 1.7

324600 EOS24531 AA503297 Hs.117108 ESTs 1.7

314973 EOS14904 AW273128 Hs.254669 EST 1.7

324432 EOS24363 AA464510 EST cluster (not in UniGene) 1.7

331520 EOS31451 N49068 Hs.93966 ESTs 1.7

308380 EOS08311 AI623988 EST singleton (not in UniGene) with exon hit 1.7

331010 EOS30941 H95039 Hs.32188 KIAA0442 protein 1.7

325363 EOS25294 c12_hs gi|5866920|ref| gn 7 + 700446 700516 ex 6 8 CDSI -6^8 71 113
CH.12_hs gi|5866920 1.7

310470 EOS10401 AI281846 Hs.165547 ESTs 1.7

330711 EOS30642 AA164687 Hs.177576 mannosyl (alpha-1;3-)-glycoprotein beta-1;4-N-acetylglucosaminyltransferase; isoenzyme A 1.7

332074 EOS32005 AA59S012 Hs.22826 ESTs 1.7

309732 EOS09663 AW262211 Hs.5662 guanine nucleotide binding protein (G protein); beta polypeptide 2-like 1 1.6

306337 EOS06268 AA954221 Hs.73742 ribosomal protein; large; P0 1.6

335169 EOS35120 CH22_2525FG_507J_UNK_EM^C005500.G£NSCAri40(U
CH22_FGENES.507_4 1.6

316253 EOS16184 AI919537 Hs.118056 ESTs 1.6

332908 EOS32839 CH22_129FG_36 1UiNK_C20H12.GENSCAN.2B4
CH22_FGENES.36J2 1.6

310002 EOS09933 AI439096 Hs^5832 ESTs 1.6

332258 EOS32189 NG8670 Hs.103806 ESTs; Weakly similar to RanBPM [H.sapiens] 1.6

336182 EOS36113 CH22_3576FG_715XUNHLDA59H1B.GENSCAN.19^
CH22_FGENES.715_2 1.6

328987 EOS28918 c_9_hs gi|5868535|ref| gn 1 - 25705 25764 ex 3 10 CDSi a90 60 438
CH.09_hs gi|5868535 1.6

324481 EOS24412 AI916284 Hs.199671 ESTs 1.6

331406 EOS31337 AA610064 Hs.23440 KIAA1105 protein 1.6

332280 EOS32211 R38100 Hs.106294 ESTs 1.6

332173 EOS32104 F09281 Hs.90424 ESTs 1.6

335739 EOS35670 CH22^3102FGL601JO_UNK.E1*AC005500.GENSCAN.49V10
CH22_FGENES.601_10 1.6

332104 EOS32035 AA609177 Hs.109363 ESTs 1.6

315033 EOS14964 AI493046 Hs.146133 ESTs 1.6

334740 EOS34671 CH22J052FG.424J5.UNieEM^C0u^00.GENSCANmi7
CH22_FGENES.424_15 1.6

334783 EOS34714 CH22j2095FG_43^8_UNICEMAC00550aGENSCAfU293-11
CH22_FGENES.432_8 1.6

308010 EOS07941 AM39190 Hs.181165 eukaryotic translation elongation factor 1 alpha 1 1.6

304521 EOS04452 AA464716 EST singleton (not in UniGene) with exon hit 1.6

316719 EOS18650 225900 Hs.18724 Homo sapiens mRNA; cDNA DKFZp564F093 (from clone DKFZp564F093) 1.6

321920 EOS21851 N63915 EST cluster (not in UniGene) 1.6

315019 EOS14950 AA532807 Hs.105822 ESTs 1.6

320793 EOS20724 AL049980 Hs.184216 DKFZP564C152 protein 1.6

305371 EOS05302 AA714180 EST singleton (not in UniGene) with exon hit 1.6

305054 EOS04985 AA634127 Hs.182426 ribosomal protein S2 1.6

314643 EOS14574 AI587502 Hs.192088 ESTs 1.6

308166 EOS08117 AI537940 EST singleton (not in UniGene) with exon hit 1.6

319371 EOS19302 R00321 Hs.174928 ESTs 1.6

331700 EOS31631 Z40011 Hs.180582 ESTs 1.6

316955 EOS16886 AW203959 Hs.149532 ESTs 1.6

314961 EOS14892 AW008061 Hs.231994 ESTs 1.6

336676 EOS36607 CH22J154FG_43_4_ CH22_FGENES.43-4 1.6

322801 EOS22732 AI831910 Hs.163734 ESTs 1.6

303363 EOS03294 AI984095 Hs.226801 ESTs; Weakly similar to DIA-156 protein [H.sapiens] 1.6

328105 EOS28036 c_6_hs gi|5868020|ref| gn 11 - 301705 301784 ex 4 7 CDSi 5.30 80 3147
CH.06_hs gi|5868020 1.6

325481 EOS25412 c12_hs gi|5866957|ref| gn 3 «■ 47590 47672 ex 4 7 CDSi 2.69 83 1895
CH.12_hs gi|5866957 1.6

315361 EOS15292 AI335229 Hs.122031 ESTs 1.6

324902 EOS24833 D31323 Hs.211188 ESTs 1.6

336016 EOS35949 CH22_3401 FG_669_7JJNKJU32I10.GENSCAN.9-12
CH22_FGENES.669_7 1.6

308747 EOS08678 AJ804500 Hs.181165 eukaryotic translation elongation factor 1 alpha 1 1.6

328251 EOS28182 c_6_hs gi|6381891|ref| gn 4 + 124444 124557 ex 2 3 CDSi 0.40 114 4554
CH.06_hs gi|6381891 1.6

303153 EOS03084 U09759 Hs.8325 mitogen-activated protein kinase 9 1.6

327809 EOS27740 c_5_hs gi|5867968|ref| gn 3 + 54610 54761 ex 4 4 CDSI 0.78 152 993
CH.05_hs gi|5867968 1.6

314107 EOS14038 AA806113 Hs.189025 ESTs 1.6

300304 EOS00235 AJ637934 Hs.224978 ESTs 1.6

313009 EOS12940 W52010 Hs.191379 ESTs 1.6

331074 EOS31005 R08440 yf19f9^1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:127337 3' similar to contains Alu repetitive element; mRNA sequence 1.6

335773 EOS35704 CH2^3142FG_607_9_UNK^EMAC005500.GENSCAR49M
CH22_FGENES.607_9 1.6

334991 EOS34922 CH22_2312FG_469_1 1JJNK_EMAC005500.GENSCANJ65-11
CH22_FGENES.469_11 1.6

322959 EOS22890 AI267606 EST cluster (not in UniGene) 1.6

323731 EOS23662 AA323414 EST cluster (not in UniGene) 1.6

331073 EOS31004 R07998 Hs.18628 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.6

313573 EOS13504 AI076259 Hs.190337 ESTs 1.6

316949 EOS16880 AA856749 Hs.124620 ESTs 1.6 
328064 EOS28015 c_6_hs gi|6469819|ref| gn 3 - 155366 155459 ex 1 4 CDSI 1.23 94 2982

CH.06_hs gi|6469819 1.6

331526 EOS31457 N49967 Hs.46624 ESTs 1.6

317987 EOS17918 AW138174 Hs.130651 ESTs 1.6
325594 EOS25525 c13_hs gi|5866992|ref| gn 4 - 470474 470566 ex 2 3 CDS! 8.09 93 68

CH.13_hs gi|5866992 1.6

310848 EOS10779 AI459554 Hs.161286 ESTs 1.6

309268 EOS09199 AI985821 Hs.62954 ferritin; heavy polypeptide 1 1.6

304518 EOS04449 AA461438 EST singleton (not in UniGene) with exon hit 1.6

331065 EOS30996 N90584 Hs.9167 Homo sapiens clone 25035 mRNA sequence 1.6

306501 EOS06432 AA987294 EST singleton (not in UniGene) with exon hit 1.6

323289 EOS23220 AL134235 Hs.222442 ESTs 1.6 
334630 EOS34561 CH2^1938FG_416_6_UNK_EMAC00550aGENSCAN.277>6 

CH22_FGENES.416_6 1.6

302025 EOS01956 AKJ91466 Hs.127241 DKFZP564F052 protein 1.6
328998 EOS28929 c_9_hs gi|5868538|ref| gn 1 + 40996 41104 ex 1 3 CDSf 11.00 109 480

CH.09_hs gi|5868538 1.6

313197 EOS13128 AI738851 Hs_E2487 ESTs 1.6 
338763 EOS38694 CH2^7585FGL_UI^EMAC005500.GENSCAN.517-16 

CH22_EM:AC005500.GENSCAN.517-16 1.6

332247 EOS32178 N58172 Hs.109370 ESTs 1.6

316724 EOS16655 AA810788 Hs.123337 ESTs 1.6

303306 EOS03237 AA215297 EST cluster (not in UniGene) with exon hit 1.6

306336 EOS06267 AA954198 EST singleton (not in UniGene) with exon hit 1.6

308256 EOS06187 AI565498 EST singleton (not in UniGene) with exon hit 1.6

307056 EOS06987 AI148675 EST singleton (not in UniGene) with exon hit 1.6

321370 EOS21301 AJ227900 EST cluster (not in UniGene) 1.6
336262 EOS36193 CH22.3661 FG_754_9_UNKJ)A59H1 8.GENSCAN.57-1 1 

CH22_FGENES.754_9 1.6
335497 EOS35428 CH22J849FG^571_5_UNK^tACOO5a)0.GENSCAN.480-26

CH22_FGENES.571_6 1.6

309562 EOS09513 AW169657 EST singleton (not in UniGene) with exon hit 1.6
329563 EOS29494 c10_p2 gi|3962490|gb|A gn 1 - 410 635 ex 2 2 CDSf 13.80 226 267

CH.10_p2 gi|3962490 1.6

332504 EOS32435 AA053917 Hs.15106 chromosome 14 open reading frame 1 1.6

308090 EOS08021 AI474501 Hs.2186 eukaryotic translation elongation factor 1 gamma 1.6

331752 EOS31683 AA287312 Hs.191648 ESTs 1.6 

330681 EOS30812 AA132988 Hs.69321 ESTs; Weakly similar to Similar to mucin and several other Ser-Thr-rich proteins [S.cerevisiae] 1.6

315647 EOS15578 AA648983 Hs^12911 ESTs 1.6

336766 EOS36697 CH22_4341FG_143_20_ CH22_FGENES.143-20 1.6

302592 EOS02523 AA294921 Hs.250811 v-ral simian leukemia viral oncogene homolog B (ras related; GTP binding protein) 1.6

315076 EOS15007 AI623817 Hs.168457 ESTs 1.6

337056 EOS36987 CH22_4946FG_441_4_ CH22_FGENES.441-4 1.6

322175 EOS22106 AF085975 EST cluster (not in UniGene) 1.6

336833 EOS36764 CH22_4504FG_242_2_ CH22_FGENES.242-2 1.6
334902 EOS34833 CH_2_2219FG 452_16_UNK.EMAC00550aGENSCAN.341 -19

CH22_FGENES.452_16 1.6

318671 EOS18602 AA188823 Hs.212621 ESTs 1.6

308064 EOS07995 AI469273 Hs.181165 eukaryotic translation elongation factor 1 alpha 1 1.6

320559 EOS20490 AB021981 Hs.159322 solute carrier family 35 (UDP-N-acetylglucosamine (UDP-GlcNAc) transporter); member 3 1.6

317881 EOS17812 AI827248 Hs.224398 ESTs 1.6

313078 EOS13009 N49730 EST cluster (not in UniGene) 1.6
338689 EOS38620 CH22jMfG_mK.EteAC0055OQ.GB4SCMlA7$3

CH22_EM:AC005500.GENSCAN.47W 1.6

311604 EOS11735 AA135159 Hs.203349 ESTs 1.6

316359 EOS16290 AI472213 Hs.123415 ESTs 1.6
330182 EOS30113 c_4_p2 gi|5123954|emb| gn 4 + 120156 120245 ex 2 2 CDS1 4.69 90 11

CH.04_p2 gi|5123954 1.6
334718 EOS34649 CH22_2O28FG_421.29„UNK^KtAC«^0.GENSCAN.282-29

CH22_FGENES.421_29 1.6

324196 EOS24127 AA405524 Hs.178000 ESTs 1.6

305350 EOS05281 AA708676 EST singleton (not in UniGene) with exon hit 1.6

331469 EOS31400 N22273 Hs.39140 ESTs 1.6

305715 EOS05646 AA826884 EST singleton (not in UniGene) with exon hit 1.6

314460 EOS14391 AI263231 Hs.145607 ESTs 1.6

317634 EOS17565 AA953088 Hs.127550 ESTs 1.6
335293 EOS35224 CH22^635FG_527_6_UN^ENtAC005500.GENSCAN.421-9

CH22_FGENES.527_6 1.6

305611 EOS05542 AA782331 EST singleton (not in UniGene) with exon hit 1.6

310430 EOS10361 AI670843 Hs.200257 ESTs 1.6

323696 EOS23627 AA841201 Hs.222051 ESTs 1.6

300610 EOS00541 N72596 Hs.99120 DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide; Y chromosome 1.6
327364 EOS27295 c_1_hs gi|6552412|ref| gn 2 - 115235 115396 ex 1 9 CDSI Z77 162 3007

CH.01_hs gi|6552412 1.6

324848 EOS24779 AW021857 EST cluster (not in UniGene) 1.6

321491 EOS21422 H70665 Hs.183960 ESTs 1.6 
336367 EOS3529B CH22_3779FG_818J1_UNK_BA232E17.GENSCAN.3.17 

CH22_FGENES.818J1 1.6

331549 EOS31480 N56866 Hs.237507 EST 1.6
328332 EOS28263 c_7_hs gi|5868375|ref| gn 6 + 280154 2BD239 ex 3 5 CDS4 -1.04 13S 516

CH.07_hs gi|5868375 1.5

322817 EOS22748 C02420 EST cluster (not in UniGene) 1.5
303383 EOS03914 AW514111 Hs.181165 eukaryotic translation elongation factor 1 alpha 1 1.5
329434 EOS29365 c_Y_hs gi|5868883|ref| gn 1 - 31124 31263 ex 3 20 CDSI 6.38 140 241

CH.Y_hs gi|5868883 1.5
338196 EOS38127 CH22.6763FG_UNK^EM^C005aX).GENSCAN.235-16

CH22_EM:AC005500.GENSCAN.235-16 1.5

308468 EOS08419 AI682148 Hs.179661 Homo sapiens clone 24703 beta-tubulin mRNA; complete cds 1.5

314883 EOS14814 AW178807 Hs.246182 ESTs 1.5

307095 EOS07026 AI167910 EST singleton (not in UniGene) with exon hit 1.5

306953 EOS06884 AI124971 EST singleton (not in UniGene) with exon hit 1.5

331786 EOS31717 AA398539 Hs.97359 EST 1.5

303509 EOS03440 AW378236 Hs.256050 ESTs 1.5

324515 EOS24446 AW501686 Hs.163539 ESTs 1.5
339323 EOS39254 CH22.6284FG_LINK.BA354l12.GENSCAN.23-2
CH22_BA354I12.GENSCAN.23-2 1.5

306563 EOS06494 AA995296 EST singleton (not in UniGene) with exon hit 1.5

316076 EOS16007 AW297895 Hs.116424 ESTs 1.5
325622 EOS25553 c14_hs gi|5867000|ref| gn 2 + 69994 70075 ex 6 8 CDSi 9.40 82 194

CH.14_hs gi|5867000 1.5

309832 EOS09563 AW193261 Hs.156110 Immunoglobulin kappa variable 1 M 1.5 

314926 EOS14857 AI380838 Hs.124835 ESTs 1.5

314456 EOS14389 AI217440 Hs.143673 ESTs 1.5
335219 EOS35150 CH22_2558FGJ13JLUNK.EMAC005500.GENSCAN.406-2

CH22_FGENES.513_2 1.5

301079 EOS01010 AA305047 Hs.183654 ESTs; Weakly similar to unknown [S.cerevisiae] 1.5
334122 EOS34053 CH22_14C0FG_333J_UKK_EMAC005500.GENSCAN.185-27

CH22_FGENES.333J 1.5

308139 EOS08070 AI494477 EST singleton (not in UniGene) with exon hit 1.5

317412 EOS17343 AI301528 Hs.132604 ESTs 1.5

315073 EOS15004 AW452948 Hs.257631 ESTs 1.5

313139 EOS13070 AA362113 EST cluster (not in UniGene) 1.5

307012 EOS06943 AI140212 EST singleton (not in UniGene) with exon hit 1.5

322895 EOS22826 AW470295 Hs.192152 ESTs 1.5 

303779 EOS03710 AA897296 Hs.221266 ESTs 1.5 

312344 EOS12275 AI742618 Hs.181733 ESTs; Weakly similar to nitrilase homolog 1 [H.sapiens] 1.5

323632 EOS23563 AL039950 EST cluster (not in UniGene) 1.5

332336 EOS32267 T96130 Hs.137551 ESTs 1.5

304547 EOS04478 AA486169 EST singleton (not in UniGene) with exon hit 1.5

335692 EOS35623 CH22_3053FG_596^UN»CBtAC00550aGENSCAN.488-7

CH22_FGENES.596_7 1.5
328333 EOS28264 c_7_hs gi|5868375|ref| gn 6 + 282506 282664 ex 4 5 CDSI 7.71 159 517

CH.07_hs gi|5868375 1.5

304143 EOS04074 R88737 EST singleton (not in UniGene) with exon hit 1.5

329625 EOS29556 c11_p2 gi|4567169|gb|A gn 2 - 85893 85984 ex 3 5 CDSI 2.24 92 29

CH.11_p2 gi|4567169 1.5
329960 EOS29891 c16_p2 gi|5091594|gb|A gn 1 - 1031 1162 ex 1 3 CDS1 10.75 132 415

CH.16_p2 gi|5091594 1.5

318975 EOS18906 Z44110 EST cluster (not in UniGene) 1.5

321875 EOS21806 N49122 EST cluster (not in UniGene) 1.5

320451 EOS20382 R26944 Hs.180777 Homo sapiens mRNA; cDNA DKFZp564W0264 (from clone DKFZp564W0264) 1.5
336020 EOS35951 O^.3403FG_669XUNKJ3J32l10.GENSCAN.9-14

CH22_FGENES.669_9 1.5

332581 EOS32512 T28799 Hs313 EphB3 1.5 
338622 EOS38553 (^22_7384FG_UNK^EfAAC005500.GENSCAN.451-1 

CH22_EM:AC005500.GENSCAN.451-1 1.5

330397 EOS30328 D14659 Hs.154387 KIAA0103 gene product 1.5

314359 EOS14290 AA205569 Hs.194193 ESTs 1.5

313456 EOS13387 AW380579 Hs^09S57 ESTs 1.5

318486 EOS18417 H09123 Hs.139258 ESTs 1.5

316175 EOS16106 AA644624 EST cluster (not in UniGene) 1.5

335684 EOS35615 CH22_3045FGJ95_4_UNK_EMJ\COrj5500.GENSCAN.487-13

CH22_FGENES.595_4 1.5
327814 EOS27745 c_5_hs gi|5867968|ref| gn 6 + 69377 70566 ex 1 2 CDS! 86.15 1190 999

CH.05_hs gi|5867968 1.5

322120 EOS22051 W84351 Hs^ 13846 ESTs 1.5

311749 EOS11680 R06249 Hs.13911 ESTs 1.5
329797 EOS29728 c14_p2 gi|6523160|emb| gn 1 - 10616 10894 ex 3 6 CDSI 5.86 279 1549

CH.14_p2 gi|6523160 1.5

330630 EOS30561 X78669 Hs.79088 reticulocalbin 2; EF-hand calcium binding domain 1.5

303777 EOS03708 AA346491 EST cluster (not in UniGene) with exon hit 1.5

309656 EOS09587 AW197060 Hs.195188 glyceraldehyde-3-phosphate dehydrogenase 1.5
326165 EOS26096 c17_hs gi|5867208|ref| gn 2 - 62787 62929 ex 1 10 CDSi 0.87 143 2037

CH.17_hs gi|5867208 1.5

308328 EOS08259 AI590571 Hs.186412 EST 1.5

300601 EOS00532 AI762130 Hs.165619 ESTs 1.5

303610 EOS03541 AA323288 EST cluster (not in UniGene) with exon hit 1.5

307856 EOS07787 AI386158 EST singleton (not in UniGene) with exon hit 1.5

319920 EOS19851 R54575 Hs.13337 ESTs; Weakly similar to similar to Phosphoglucomutase and phosphomannomutase

phosphoserine [C.elegans] 1.5

332167 EOS32098 D57389 Hs.75447 ralA binding protein 1 1.5

316427 EOS16358 AI241019 Hs.145644 ESTs 1.5

303886 EOS03817 AW365963 EST cluster (not in UniGene) with exon hit 1.5

314292 EOS14223 AA732590 Hs.134740 ESTs 1.5

315408 EOS15339 AW273261 Hs.216292 ESTs 1.5
335698 EOS35629 CH22„3059FG_597J_UNK^EhtAC005500.GENSCAR489-1

CH22_FGENES.597_1 1.5

315084 EOS15015 AI821085 Hs.187796 ESTs 1.5

302299 EOS02230 R64632 Hs.182167 hemoglobin; gamma A 1.5

306803 EOS06734 AI055860 Hs.193717 interleukin 10 1.5

315802 EOS15733 AA677540 Hs.117084 ESTs 1.5

326257 EOS26188 c17_hs gi|5867264|ref| gn 6 + 222712 222819 ex 2 2 CDSI 4.46 108 3597
CH.17_hs gi|5867264 1.5

319599 EOS19530 H56112 EST cluster (not in UniGene) 1.5

321891 EOS21822 AW167424 Hs.165954 ESTs 1.5

335164 EOS35095 CH22^250OFG_5O2_8_UNK^EM^CO05500.GENSCAM396-23
CH22_FGENES.502_8 1.5

327133 EOS27064 c21_hs gi|6682522|ref| gn 1 + 38069 38938 ex 2 2 CDS1 63.42 870 1583
CH.21_hs gi|6682522 1.5

317460 EOS17391 AA926980 Hs.131347 ESTs 1.5

332344 EOS32275 W45574 Hs_2S2497 ESTs 1.6

328801 EOS28732 c_7_hs gi|5868321|ref| gn 1 . 44492 44609 ex 2 3 CDSi 1.71 118 5525
CH.07_hs gi|5868321 1.5

321677 EOS21608 N44545 Hs.51865 ESTs 1.5

331858 EOS31789 AA421163 Hs.163848 ESTs 1.5

309243 EOS09174 AI972052 EST singleton (not in UniGene) with exon hit 1.5

326213 EOS26144 c17_hs gi|5867224|ref| gn 3 - 60751 60927 ex 1 4 CDS! 2.06 177 2687
CH.17_hs gi|5867224 1.5

321632 EOS21563 AA419617 EST cluster (not in UniGene) 1.5

321424 EOS21355 AA057301 EST cluster (not in UniGene) 1.5

322465 EOS22396 AA137152 Hs.3784 ESTs; Highly similar to phosphoserine aminotransferase [H.sapiens] 1.5

333391 EOS33322 CH22J37FGJ 44_6_UN}^EM:ACO055Q0.GENSCAN.2&6
CH22_FGENES.144_6 1.5

333384 EOS33315 CH2?_630FGJ43_23_UNK_EMAC00550a(^NSCAN.24.17
CH22_FGENES.143_23 1.5

334784 EOS34715 C^2096FG_432XUNK_FJ^C005500.GENSCAN.293-12
CH22_FGENES.432_9 1.5

334078 EOS34009 CH2^1356FGJ27.33.UNK_EMAC00550aGENSCAN.181-35
CH22_FGENES.327_33 1.5

335158 EOS35089 CH2^.2494FGJ02XUNK.EMAC00550aGENSCAN.396-17
CH22_FGENES.502_2 1.5

335062 EOS34993 CH2a^388FG_4&?_17_UNieENfcAC005500.GENSCAN.376.16
CH22_FGENES.482_17 1.5

333243 EOS33174 CH22J82FGJ 1 L7_UNK_EIVtAC000097.GENSCAN.1 20^
CH22_FGENES.1H_7

306380 EOS06311 AA968861 EST singleton (not in UniGene) with exon hit 1.5

320809 EOS20740 AI540299 EST cluster (not in UniGene) 1.5

332813 EOS32744 CH22_29FG_8J_UNK_C65E1.GENSCAfl2-2
CH22_FGENES.8J 1.5

335817 EOS35748 CH22J189FGJ18J_UMieEMAC005500.GENSCAN.51W
CH22_FGENES.618J 1.5

319551 EOS19482 AA761668 EST cluster (not in UniGene) 1.5

334472 EOS34403 CH22J771 FG_394_3_UNK.EMAC005500.GENSCANi57-3
CH22_FGENES.394J 1.5

333029 EOS32960 CH22_255FGJ8_3_UNK_EIytAC000O97.G_NSCA>J.4O^
CH22_FGENES.68J 1.5

308055 EOS07986 AI468091 Hs.119252 tumor protein; translationally-controlled 1 1.5

302882 EOS02813 AW403330 EST cluster (not in UniGene) with exon hit 1.5

314033 EOS13964 AA167125 EST cluster (not in UniGene) 1.5

324928 EOS24859 AI932285 Hs.160569 ESTs 1.5

329524 EOS29455 c10_p2 gi|3983507|gb|A gn 6 - 38025 38143 ex 3 3 CDSi 2.40 119 170
CH.10_p2 gi|3983507 1.5

333131 EOS33062 CH22_360FG_83_6_LINK.EJAACOOD097. GENSCAM67-1 0
CH22_FGENES.83J 1.5

332085 EOS32016 AA600353 Hs.173933 ESTs; Weakly similar to NUCLEAR FACTOR 1/X [H.sapiens] 1.5

305369 EOS05300 AA714040 EST singleton (not in UniGene) with exon hit 1.5

300344 EOS00275 AW291487 Hs.213659 ESTs 1.5

325071 EOS25002 H09S93 EST cluster (not in UniGene) 1.5

323693 EOS23624 AW297758 Hs.249721 ESTs 1.5

321899 EOS21830 N55158 Hs.135252 ESTs 1.5

331857 EOS31788 AA421160 Hs.9456 SWI/SNF related; matrix associated; actin dependent regulator of chromatin; subfamily a; member 5 1.5

334850 EOS34781 CH22_2164FG_439J6_Ur^EM^C(KJ5500.GENSCANJ11.13
CH22_FGENES.439_36 1.5

322610 EOS22541 AF180919 EST cluster (not in UniGene) 1.5

335332 EOS35263 CH22_2677FG_535J_UMK^EhtAC005500.GENSCAN.426^
CH22_FGENES.535_6 1.5

307565 EOS07496 AI282466 EST singleton (not in UniGene) with exon hit 1.5

314140 EOS14071 AI216473 Hs.154297 ESTs 1.5

323011 EOS22942 AA580288 EST cluster (not in UniGene) 1.5

325366 EOS25297 c12_hs gi|5866920|ref| gn 9 - 920962 921713 ex 1 8 CDS1 15.95 752 167
CH.12_hs gi|5866920 1.5

322306 EOS22237 W75935 Hs.146083 ESTs 1.5

311034 EOS10965 AI564023 Hs.171467 ESTs; Highly similar to NKG2-D TYPE II INTEGRAL MEMBRANE PROTEIN [H.sapiens] 1.5

305081 EOS05012 AA641638 EST singleton (not in UniGene) with exon hit 1.5

322933 EOS22864 AA099759 EST cluster (not in UniGene) 1.5

335221 EOS35152 CH22_2560FGJ13J_UNieEM^C005500.GENSCAN.40&4
CH22_FGENES.513_4 1.5

304948 EOS04879 AA613107 EST singleton (not in UniGene) with exon hit 1.5

334900 EOS34831 CH22_2217FG_452J4_UNKJMA(^
CH22_FGENES.452J4 1.5

318404 EOS18335 AI654108 Hs.135125 ESTs 1.5

339358 EOS39289 CH22_8328FG_UNKJA354I12.GENSCAN.3W
CH22_BA354I12.GENSCAN.31-3 1.5

327074 EOS27005 c21_hs gi|6531965|ref| gn 58 + 4039993 4040096 ex 3 4 CDSi 0.68 104 1284
CH.21_hs gi|6531965 1.5
326054 EOS25985 c17_hs gi|5867184|ref| gn 2 - 146342 146469 ex 3 4 CDSi 10.00 128 426

CH.17_hs gi|5867184 1.5
326892 EOS26823 c20_hs gi|6682511|ref| gn 5 + 119424 119500 ex 29 30 CDS! 18.89 77 2313
CH.20_hs gi|6682511 1.6

326767 EOS26698 c_7_hs gi|6017031|ref| gn 1 - 35625 35723 ex 4 4 CDSf 5.63 99 5262

CH.07_hs gi|6017031 1.5
337772 EOS37703 CH22_6125F6_UNieBtAC000097.GENSCAN.119-11

CH22_EM:AC000097.GENSCAN.119-11 1.5
312199 EOS12130 AW438602 Hs.191179 ESTs 1.5
303506 EOS03437 AA340605 Hs.105887 ESTs 1.5
325176 EOS25107 T52843 EST cluster (not in UniGene) 1.5

302023 EOS01954 AF060567 Hs.126782 sushi-repeat protein 1.5
305833 EOS05764 AA857836 Hs.161165 eukaryotic translation elongation factor 1 alpha 1 1.5
309131 EOS09062 AI929175 Hs.119122 ribosomal protein L13a 1.5
334184 EOS34115 CH22_1465F(^350J5_UNK_EM^C00550aGENSCAN209-17

CH22_FGENES.350J5 1.5
335188 EOS35119 CH22_2524FG_507_3_UNK^E!rtAC00550D.GENSCAN.40(W

CH22_FGENES.507_3 1.5

304813 EOS04744 AA584540 EST singleton (not in UniGene) with exon hit 1.5

315359 EOS15290 AA608808 Hs.225118 ESTs 1.5

324434 EOS24365 AA707249 Hs.98789 ESTs 1.5
327910 EOS27841 c_6_hs gi|5868162|ref| gn 1 + 21622 21748 ex 6 7 CDSI 3.69 127 449

CH.06_hs gi|5868162 1.4
335571 EOS35602 C^3031FGJ92.3^UNK.EM^C005500.GENSCAN.4854

CH22_FGENES.592_3 1.4
334943 EOS34874 CH22_2264FG_465XUNK_EMAC005500.GENSCAN.35W

CH22_FGENES.465_8 1.4
326393 EOS26324 c19_hs gi|5867341|ref| gn 2 + 41702 41841 ex 5 5 CDSi 20.15 140 604

CH.19_hs gi|5867341 1.4

305296 EOS05227 AA687181 EST singleton (not in UniGene) with exon hit 1.4

307243 EOS07174 AI199957 EST singleton (not in UniGene) with exon hit 1.4

320066 EOS19997 AW364885 Hs.112442 ESTs 1.4

311465 EOS11396 AJ758660 Hs.206132 ESTs 1.4

302822 EOS02753 AW404176 Hs.111611 ribosomal protein L27 1.4

304987 EOS04918 AA618044 EST singleton (not in UniGene) with exon hit 1.4

330892 EOS30823 AA149579 Hs.118258 ESTs 1.4
333385 EOS33316 CH22_631FG_143_24_UNK^EMAC005500.GENSGAN.24-18

CH22_FGENES.143_24 1.4

302626 EOS02557 AB021870 EST cluster (not in UniGene) with exon hit 1.4

318042 EOS17973 AW294522 Hs.149991 ESTs 1.4
339361 EOS39292 CH22_8331FG__UNK_BA354HZGENSCAN.32-3

CH22_BA354I12.GENSCAN.32-3 1.4

309000 EOS08931 AL880489 EST singleton (not in UniGene) with exon hit 1.4

306004 EOS05935 AA889992 EST singleton (not in UniGene) with exon hit 1.4

329539 EOS29470 c10_p2 gi|3983503|gb|U gn 1 - 1 326 ex 1 3 CDS! 41.66 326 212

CH.10_p2 gi|3983503 1.4

313663 EOS13594 AI953261 Hs.169813 ESTs 1.4 

323538 EOS23469 AW247696 EST duster (not in UniGene) 1.4 

50 337595 EOS37526 CH22_5884FG__LINreC20H12GENSCAN.B-1 

CH22_C20H12.GENSCAN.8-1 1.4 

303149 EOS03080 AA312995 EST duster (not In UniGene) with exon hit 1.4 

308484 EOS08415 AI679292 EST singleton (not in UniGene) with exon hit 1.4 

300912 EOS00843 AW138724 Hs.168974 ESTs 1.4

315158 EOS15089 AA744438 Hs.142476 ESTs; Weakly similar to !!!! ALU CLASS D WARNING ENTRY !!!! [H.sapiens] 1.4

300462 EOS00393 AA746501 Hs.14217 ESTs 1.4

312730 EOS12661 AI804372 Hs.208661 ESTs 1.4

316866 EOS16799 AI660898 Hs.195602 ESTs 1.4 
337629 EOS37560 CH22_5933FG_UNK_C20H12.GENSCAN.2835

CH22_C20H12.GENSCAN.2835 1.4

332518 EOS32449 D16562 Hs.155433 ATP synthase; H+ transporting; mitochondrial F1 complex; gamma polypeptide 1 1.4

337422 EOS37353 CH22_5624FG_760_2_ CH22_FGENES.760-2 1.4
328835 EOS28766 c_7_hs gi|5868339|ref| gn 5 + 88053 88461 ex 3 3 CDSl 13.78 409 5775

CH.07_hs gi|5868339 1.4
338282 EOS38213 CH22J897H3_UNK_EWAC005500.GENSCAN.2914

CH22_EMAO005500.GENSCAN^9M 1.4 
337895 EOS37826 CH22L.6303FG__UNK.EMAC005500.GENSCAN.55-2 

CH22_EMAC005500.GENSCAN.56-2 1.4 

320330 EOS20251 AF026004 Hs.141660 chloride channel 2 1.4 

314302 EOS14233 AA813118 Hs.163230 ESTs 1.4

313280 EOS13211 AI285537 Hs.222830 ESTs 1.4
333222 EOS33153 CH22_459FG_105_2_UNK_EM:AC000097.GENSCAN.109-6

CH22_FGENES.105_2 1.4

305726 EOS05657 AA828156 EST singleton (not in UniGene) with exon hit 1.4

312674 EOS12605 AI762475 Hs.151327 ESTs; Moderately similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.4

315869 EOS15800 AI033547 Hs.132826 ESTs 1.4
327010 EOS26941 c21_hs gi|5867664|ref| gn 12 + 941057 941139 ex 9 9 CDSl 7.44 83 790

CH.21_hs gi|5867664 1.4
325892 EOS25823 c16_hs gi|5867088|ref| gn 1 - 10496 10652 ex 2 3 CDSi 3.94 155 870

CH.16_hs gi|5867088 1.4

302575 EOS02506 AF071164 Hs.249171 homeo box A11 1.4

301970 EOS01901 AB028962 Hs.120245 KIAA1039 protein 1.4

332207 EOS32138 H61475 Hs.237353 EST 1.4

316024 EOS15955 AA707141 Hs.193388 ESTs 1.4

314599 EOS14530 AW206512 Hs.186996 ESTs 1.4
333585 EOS33516 CH22_846FG_203_4_UNK_EM:AC005500.GENSCAN.74^

CH22_FGENES.203_4 1.4

324670 EOS24601 AI525557 EST cluster (not in UniGene) 1.4

321307 EOS21238 R85409 EST cluster (not in UniGene) 1.4

335170 EOS35101 CH22_250SFG_503J^UNK_E\tACOD5500,GENSCAN.397-1 

CH22_FGENES.503_1 1.4

328274 EOS28205 c_7_hs gi|5868219|ref| gn 2 - 31244 31439 ex 1 fl CDSl ia06 196 9

CH.07_hs gi|5868219 1.4

336880 EOS36811 CH22_4619FG_318_8_ CH22_FGENES.318-8 1.4

313825 EOS13756 AA215470 EST cluster (not in UniGene) 1.4

318410 EOS18341 AI138418 Hs.144935 ESTs 1.4
335361 EOS35292 CH22_2710FG_541_11_UNK_EM:AC005500.GENSCAN.431-16

CH22_FGENES.541_11 1.4

319802 EOS19733 AI701489 Hs.202501 ESTs 1.4
334769 EOS34700 CH22_2081 FG_4fflAUMK_EM^CC055D0.C^NSCAhl29 W 

CH22_FGENES.429_4 1.4

312709 EOS12640 AW069181 Hs.141146 ESTs; Weakly similar to transformation-related protein [H.sapiens] 1.4
330004 EOS29935 c16_p2 gi|6623963|gb|A gn 5 - 78872 78999 ex 2 6 CDSi 19.93 128 728

CH.16_p2 gi|6623963 1.4

313103 EOS13034 AI184303 Hs.143806 ESTs 1.4
326359 EOS26290 c18_hs gi|5867293|ref| gn 1 9436 9494 ex 2 3 CDSi 2.16 59 88

CH.18_hs gi|5867293 1.4

305211 EOS05142 AA668563 EST singleton (not in UniGene) with exon hit 1.4 

334628 EOS34559 CH22_1936FG_416_4_UNK_EM:AC005500.GENSCAN.2774

CH22_FGENES.416_4 1.4
326919 EOS26850 c21_hs gi|5456782|ref| gn 2 - 40485 41046 ex 1 6 CDSl 17.70 561 157

CH.21_hs gi|6456782 1.4

315527 EOS15458 AI791138 Hs.116768 ESTs 1.4

306090 EOS06021 AA908609 EST singleton (not in UniGene) with exon hit 1.4

303316 EOS03247 AF033122 Hs.14125 p53 regulated PA26 nuclear protein 1.4

303642 EOS03573 AW299459 EST cluster (not in UniGene) with exon hit 1.4

314357 EOS14288 AA781795 Hs.122587 ESTs 1.4

337102 EOS37033 CH22_5033FG_472_7_ CH22_FGENES.472-7 1.4

304384 EOS04315 AA235482 Hs.62954 ferritin; heavy polypeptide 1 1.4

315117 EOS15048 AA828809 Hs.192044 ESTs 1.4

305750 EOS05681 AA835250 EST singleton (not in UniGene) with exon hit 1.4

311726 EOS11657 AW081766 Hs.253920 ESTs 1.4
326996 EOS26927 c21_hs gi|5867660|ref| gn 4 - 63212 63404 ex 2 6 CDSi 15.70 193 622

CH.21_hs gi|5867660 1.4
330257 EOS30188 c_5_p2 gi|6671881|gb|A gn 2 - 143228 143393 ex 1 9 CDSl 11.31 166 586

CH.05_p2 gi|6671881 1.4

323864 EOS23795 AA340724 Hs.214028 ESTs 1.4 
338204 EOS38135 CH22_6773FG_UNK_EM:AC005500.GENSCAN.241-3

CH22_EM:AC005500.GENSCAN.241-3 1.4

314025 EOS13956 AI983981 Hs.189114 ESTs 1.4

315974 EOS15905 AW029203 Hs.191S52 ESTs 1.4
335599 EOS35530 CH22_2957FG_581_39_UNK_EM:AC005500.GENSCAN.47&^7

CH22_FGENES.581_39 1.4
335364 EOS35295 CH22_2713FG_543_2_UNK_EM:AC005500.GENSCAN.432-4

CH22_FGENES.543_2 1.4

303634 EOS03565 AI953377 Hs.169425 ESTs; Weakly similar to predicted using Genefinder [C.elegans] 1.4

315626 EOS15557 AA808598 Hs.35353 ESTs; Weakly similar to H21P03.2 [C.elegans] 1.4
329936 EOS29867 c16_p2 gi|6165200|gb|A gn 4 - 82761 82920 ex 3 4 CDSi 1.15 160 199

CH.16_p2 gi|6165200 1.4
328632 EOS28563 c_7_hs gi|5868247|ref| gn U 76734 76853 ex 1 4 CDSf 11.95 120 3764

CH.07_hs gi|5868247 1.4

330207 EOS30138 c_5_p2 gi|6013606|gb|A gn 3 - 109912 110004 ex 2 4 CDSi 6.54 93 174

CH.05_p2 gi|6013606 1.4
329919 EOS29850 c16_p2 gi|6223624|gb|A gn 6 - 103492 103681 ex 1 8 CDSl 6.18 190 93

CH.16_p2 gi|6223624 1.4

331916 EOS31847 AA446131 Hs.124918 ESTs 1.4

317617 EOS17548 T58194 EST cluster (not in UniGene) 1.4

331943 EOS31874 AA453418 Hs.176272 ESTs 1.4

306413 EOS06344 AA973288 EST singleton (not in UniGene) with exon hit 1.4

313607 EOS13538 N94169 Hs.194258 ESTs; Moderately similar to !!!! ALU SUBFAMILY SC WARNING ENTRY !!!! [H.sapiens] 1.4
336292 EOS36223 CH22_3691FG_783_3_UNK_BA354I12.GENSCAN.4-7

CH22_FGENES.783_3 1.4

330453 EOS30384 HG3976-HT4246 F^Oorr^nDrmBin_ngFac^Rrl t Pituitary-Specific 1.4

324602 EOS24533 AA503620 Hs.213239 ESTs 1.4

332183 EOS32114 H08225 Hs.177181 ESTs 1.4

320032 EOS19963 AI699772 Hs.202361 ESTs; Weakly similar to X-linked retinopathy protein [H.sapiens] 1.4
333156 EOS33087 CH22_387FG_89_6_UNK_EM:AC000097.GENSCAN.84-8

CH22_FGENES.89_6 1.4
334156 EOS34087 CH22_1435FG_340_6_UNK_EM:AC005500.GENSCAN.190-7

CH22_FGENES.340_6 1.4
334303 EOS34234 CH22_1594FG_373_6_UNK_EM:AC005500.GENSCAN.233^

CH22_FGENES.373_6 1.4
325513 EOS25444 c12_hs gi|6017035|ref| gn 1 - 34295 34490 ex 2 7 CDSi 6.49 196 2471

CH.12_hs gi|6017035 1.4

302758 EOS02689 AA984563 EST cluster (not in UniGene) with exon hit 1.4

329557 EOS29488 c10_p2 gi|3962492|gb|A gn 6 - 53197 53647 ex 2 2 CDSf 37.68 451 247

CH.10_p2 gi|3962492 1.4

331717 EOS31648 AA190888 Hs.153881 ESTs; Highly similar to NY-REN-2 antigen [H.sapiens] 1.4
325885 EOS25816 c16_hs gi|5867087|ref| gn 11 + 193212 193377 ex 1 3 CDSf 43.19 166 792

CH.16_hs gi|5867087 1.4

312160 EOS12091 AA805903 Hs.184371 ESTs 1.4
328882 EOS28813 c_7_hs gi|6552423|ref| gn 2 - 157669 157826 ex 4 6 CDSi 4.91 158 6200

CH.07_hs gi|6552423 1.4

339028 EOS38959 CH22_7925FG_UNK_DA59H18.GENSCAN.22-8

CH22_DA59H18.GENSCAN.22-8 1.4

323497 EOS23428 AI523613 Hs.221544 ESTs 1.4

316697 EOS16628 AA838114 EST cluster (not in UniGene) 1.4

312479 EOS12410 AI950844 Hs.128738 ESTs; Weakly similar to non-lens beta gamma-crystallin like protein [H.sapiens] 1.4

338535 EOS38466 CH22_7251FG_UNK_EM:AC005500.GENSCAN.404-3

CH22_EM:AC005500.GENSCAN.404-3 1.4

312754 EOS12685 R99834 Hs.2S0383 ESTs 1.4

327527 EOS27458 c_2_hs gi|6381882|ref| gn 2 - 98950 99040 ex 4 8 CDSi 5.78 91 1768

CH.02_hs gi|6381882 1.4

324714 EOS24645 AA574312 Hs^45737 ESTs 1.4

302347 EOS02278 AF039400 Hs.194659 chloride channel calcium activated; family member 1 1.4

338008 EOS37939 CH22_6490FG_UNK_EM:AC005500.GENSCAN.127-9

CH22_EM:AC005500.GENSCAN.127-9 1.4

315590 EOS15521 AA640637 Hs.225817 ESTs 1.4

320825 EOS20756 NM_004751 EST cluster (not in UniGene) 1.4

300930 EOS00861 AI289481 Hs.136371 ESTs 1.4

335225 EOS35156 CH22_2564FG_513_10_UNK_EM:AC005500.GENSCAN.406-9

CH22_FGENES.513_10 1.4

337303 EOS37234 CH22_5442FG_681_5_ CH22_FGENES.681-5 1.4

317198 EOS17129 AI810384 Hs.128025 ESTs 1.4

308991 EOS08922 AI879831 EST singleton (not in UniGene) with exon hit 1.4

325472 EOS25403 c12_hs gi|6017034|ref| gn 7 - 289581 289657 ex 2 6 CDSi 4.74 77 1786

CH.12_hs gi|6017034 1.4

301266 EOS01197 AA829774 EST cluster (not in UniGene) with exon hit 1.4

330901 EOS30832 M157818 Hs.238360 Human endogenous retroviral protease mRNA; complete cds 1.4

313406 EOS13337 AI248314 Hs.132932 ESTs 1.4

301454 EOS01385 AI751738 EST cluster (not in UniGene) with exon hit 1.4

317269 EOS17200 AA906411 Hs.127378 ESTs 1.4

338876 EOS38807 CH22_7733FG_UNKJXI32l10.GENSCAN.4-2

CH22_DJ32J10.GENSCAN.4-2 1.4

328481 EOS28412 c_7_hs gi|5868449|ref| gn 1 - 8987 9180 ex 4 31 CDSi 10.00 194 2103

CH.07_hs gi|5868449 1.4

314022 EOS13953 AW452420 Hs.248678 ESTs 1.4

307640 EOS07571 AI301992 EST singleton (not in UniGene) with exon hit 1.4

315541 EOS15472 AI168233 Hs.123159 ESTs; Weakly similar to KIAA0668 protein [H.sapiens] 1.4

315489 EOS15420 AA628245 Hs.191847 ESTs 1.4

327815 EOS27746 c_5_hs gi|5867968|ref| gn 6 + 70804 71401 ex 2 2 CDSl 27.99 598 1000

CH.05_hs gi|5867968 1.4

339319 EOS39250 CH22_8280FG_UNK_BA354I12.GENSCAN.22-19

CH22_BA354I12.GENSCAN.22-19 1.4

322564 EOS22495 W86440 Hs.118344 ESTs 1.4

323812 EOS23743 AW081373 Hs.199199 ESTs 1.4

303540 EOS03471 AA355607 Hs.173590 ESTs; Weakly similar to MMSET type 1 [H.sapiens] 1.4

337902 EOS37833 CH22_6314FG_UNK_EM:AC005500.GENSCAN.56-13

CH22_EM:AC005500.GENSCAN.56-13 1.4

335289 EOS35220 CH22_2631FG_527_2_UNK_EM:AC005500.GENSCAN.421-2

CH22_FGENES.527_2 1.4

327919 EOS27850 c_6_hs gi|5868165|ref| gn 6 + 547701 547800 ex 14 14 CDSl -0.20 100 505

CH.06_hs gi|5868165 1.4

337674 EOS37605 CH22_6005FG_UNK_EM:AC000097.GENSCAN.674

CH22_EM:AC000097.GENSCAN.674 1.4

320087 EOS20018 AF032387 Hs.113265 small nuclear RNA activating complex; polypeptide 4; 190kD 1.4

334939 EOS34870 CH22_2259FG_465_3_UNK_EM:AC005500.GENSCAN.35W

CH22_FGENES.465_3 1.3

303443 EOS03374 AA320525 EST cluster (not in UniGene) with exon hit 1.3

325929 EOS25860 c16_hs gi|5867125|ref| gn 2 - 61715 61996 ex 1 1 CDSo 29.05 282 1594

CH.16_hs gi|5867125 1.3

327745 EOS27676 c_5_hs gi|6531959|ref| gn 1 - 229066 229124 ex 3 6 CDSi 3.01 59 177

CH.05_hs gi|6631959 1.3

335166 EOS35097 CH22_2502FG_502_10_UNK_EM:AC005500.GENSCAN.396-25

CH22_FGENES.502_10 1.3

324497 EOS24428 AW152624 Hs.136340 ESTs 1.3

338374 EOS38305 CH22_7017FG_UNK_EM:AC005500.GENSCAN.327-1

CH22_EM:AC005500.GENSCAN.327-1 1.3

313601 EOS13532 R32458 Hs.257711 ESTs 1.3

321415 EOS21346 A077596 Hs.3337 transmembrane 4 superfamily member 1 1.3

305309 EOS05240 AA699717 EST singleton (not in UniGene) with exon hit 1.3

330447 EOS30378 HG3546-HT3744 Pre-mRNA Splicing Factor 812033, Alt. Splice Form 1 1.3

308578 EOS08509 AI708573 EST singleton (not in UniGene) with exon hit 1.3

315344 EOS15275 AW292176 Hs.245834 ESTs 1.3

330503 EOS30434 M55024 Human cell surface glycoprotein P3.58 mRNA, partial cds 1.3

308227 EOS08158 AI559126 Hs.195188 glyceraldehyde-3-phosphate dehydrogenase 1.3

332222 EOS32153 N28271 Hs.176618 ESTs 1.3

323961 EOS23892 AL044428 Hs.207345 ESTs 1.3

314530 EOS14461 AI052358 Hs.131741 ESTs 1.3

320503 EOS20434 NM_005897 EST cluster (not in UniGene) 1.3

306820 EOS06751 AI074408 EST singleton (not in UniGene) with exon hit 1.3

304165 EOS04096 H73265 EST singleton (not in UniGene) with exon hit 1.3

324302 EOS24233 AA543008 Hs.136806 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.3

319128 EOS19059 AA393820 EST cluster (not in UniGene) 1.3

317092 EOS17023 AI286162 Hs.125657 ESTs 1.3

304998 EOS04929 AA821203 EST singleton (not in UniGene) with exon hit 1.3

331433 EOS31364 H68097 Hs.161023 EST 1.3

333348 EOS33279 CH22_594FG_140_2_UNK_EM:AC005500.GENSCAN^0-2

CH22_FGENES.140_2 1.3
333619 EOS33550 CH2^880FG^19AUNlCBAAa)05500.GENSCANi7-2 

CH22_FGENES.219_1 1.3

335903 EOS35834 CH22_3280FG_635_11_UNK_EM:AC005500.GENSCAN.525-14

CH22_FGENES.635_11 1.3
326219 EOS26150 c17_hs gi|5867226|ref| gn 11 - 264008 264274 ex 3 5 CDSi 5.74 267 2347

CH.17_hs gi|5867226 1.3

324456 EOS24387 AW500954 EST cluster (not in UniGene) 1.3

316405 EOS16336 AA757900 Hs.202624 ESTs 1.3

314361 EOS14292 AL038765 Hs.161304 ESTs 1.3
328545 EOS28477 c_7_hs gi|5868487|ref| gn 1 - 17547 17722 ex 2 3 CDSi 9.96 176 3284

CH.07_hs gi|5868487 1.3
335871 EOS35802 CH22_3246FG_629_19_UNK_EM:AC005500.GENSCAN.519-18

CH22_FGENES.629_19 1.3

303735 EOS03666 AA707750 Hs.202616 ESTs; Weakly similar to cis-Golgi matrix protein GM130 [R.norvegicus] 1.3

324046 EOS23979 AA378739 EST cluster (not in UniGene) 1.3

326720 EOS26651 c20_hs gi|6552456|ref| gn 1 + 84525 84677 ex 5 7 COS 1 1.78 153 1031

CH.20_hs gi|6552456 1.3

322309 EOS22240 AF086372 EST cluster (not in UniGene) 1.3

322136 EOS22067 AF075083 EST cluster (not in UniGene) 1.3

313460 EOS13391 AW028655 Hs.136033 ESTs 1.3

306275 EOS06206 AA936312 EST singleton (not in UniGene) with exon hit 1.3

321974 EOS21905 N76794 EST cluster (not in UniGene) 1.3

327600 EOS27531 c_3_hs gi|6004462|ref| gn 1 - 2621 2862 ex 1 4 CDSl 4.01 242 1407

CH.03_hs gi|6004462 1.3
329086 EOS29017 c_X_hs gi|5868604|ref| gn 1 - 35489 35588 ex 2 9 CDSi 2.55 100 719

CH.X_hs gi|5868604 1.3

336919 EOS36850 CH22_4690FG_346_6_ CH22_FGENES.346-6 1.3

302767 EOS02638 H94900 Hs.17882 ESTs 1.3
334786 EOS34717 CH22_2098FG_432_11_UNK_EM:AC005500.GENSCAN^93-14

CH22_FGENES.432_11 1.3

302472 EOS02403 AA317451 Hs.241451 SWI/SNF related; matrix associated; actin dependent regulator of chromatin; subfamily e; member 1 1.3
333033 EOS32964 CH22_259FG_68_8_UNK_EM:AC000097.GENSCAN.4W

CH22_FGENES.68_8 1.3

330493 EOS30424 M27826 Hs.238380 Human endogenous retroviral protease mRNA; complete cds 1.3

330506 EOS30437 M61906 Hs.6241 phosphoinositide-3-kinase; regulatory subunit; polypeptide 1 (p85 alpha) 1.3

313932 EOS13863 AI147601 Hs.154087 ESTs 1.3

314394 EOS14325 AI380563 Hs.130816 ESTs 1.3

323033 EOS22964 AI744284 Hs.221727 ESTs 1.3
326431 EOS26362 c19_hs gi|5867371|ref| gn 1 + 15855 15971 ex 4 6 CDSi 7.79 117 1108

CH.19_hs gi|5867371 1.3
335547 EOS35478 CH22_2902FG_576_8_UNK_EM:AC005500.GENSCAN.467-8

CH22_FGENES.576_8 1.3

300548 EOS00479 AI026836 Hs.114689 ESTs 1.3 

316504 EOS16435 AW135854 Hs.132458 ESTs 1.3 
335756 EOS35687 CH22_3123FG_604_5_UNK_EM:AC005500.GENSCAN.493-10

CH22_FGENES.604_5 1.3

301209 EOS01140 AI809912 Hs.159354 ESTs 1.3

306610 EOS06541 AI000635 EST singleton (not in UniGene) with exon hit 1.3

314439 EOS14370 AI539443 Hs.137447 ESTs 1.3

315396 EOS15327 AW296107 Hs.152686 ESTs 1.3
335914 EOS35845 CH22_3291FG_636_10_UNK_EM:AC005500.GENSCAN.526-10

CH22_FGENES.636_10 1.3
333734 EOS33665 CH22_1000FG_260_2_UNK_EM:AC005500.GENSCAN.119-7

CH22_FGENES.260_2 1.3

312370 EOS12301 AA744692 Hs.166539 ESTs 1.3

304636 EOS04567 AA524031 EST singleton (not in UniGene) with exon hit 1.3

323166 EOS23097 AA291001 EST cluster (not in UniGene) 1.3

338702 EOS38633 CH22_7482FG_UNK_EM:AC005500.GENSCAN.480-1

CH22_EM:AC005500.GENSCAN.480-1 1.3

322331 EOS22262 AF086467 EST cluster (not in UniGene) 1.3 

318706 EOS18637 AI383593 Hs.159148 ESTs 1.3

331186 EOS31117 T41159 Hs.8418 ESTs 1.3
334764 EOS34695 CH22_2076FG_428_13_UNK_EM:AC005500.GENSCAN.289-13

CH22_FGENES.428_13 1.3
327565 EOS27496 c_3_hs gi|5867811|ref| gn 1 + 32516 32778 ex 2 3 CDSi 0.20 263 358

CH.03_hs gi|5867811 1.3
335524 EOS35455 CH22_2879FG_572_4_UNK_EM:AC005500.GENSCAN.4614

CH22_FGENES.572_4 1.3

308050 EOS07981 AI460004 EST singleton (not in UniGene) with exon hit 1.3

334172 EOS34103 CH22_1452FG_349_5_UNK_EM:AC005500.GENSCAN.20a€

CH22_FGENES.349_5 1.3

315674 EOS15605 AA651923 Hs.191850 ESTs 1.3
334876 EOS34807 CH22_2190FG_450_6_UNK_EM:AC005500.GENSCAN.33^

CH22_FGENES.450_6 1.3

315606 EOS15537 AW298724 Hsi02639 ESTs 1.3 
338779 EOS38710 CH22_7610FG_UNK_EM:AC005500.GENSCAN.526-15

CH22_EM:AC005500.GENSCAN.526-15 1.3
333511 EOS33442 CH22_766FG_171_5_UNK_EM:AC005500.GENSCAN.5W

CH22_FGENES.171_5 1.3
329254 EOS29185 c_X_hs gi|5868733|ref| gn 1 -+ 4133 4214 ex 1 2 CDSi -0.36 82 2833

CH.X_hs gi|5868733 1.3

319510 EOS19441 W88633 Hs.254562 ESTs 1.3
339418 EOS39349 CH22_8411FG_UNK_DJ579N16.GENSCAN.11-4

CH22_DJ579N16.GENSCAN.11-4 1.3

321012 EOS20943 AA737314 EST cluster (not in UniGene) 1.3
333217 EOS33148 (^4Mre_104_9_UNlK_EMAC000097.GENSCAN.108^

CH22_FGENES.104_9 1.3
338581 EOS38492 CH22_7294FG_UNK_EM:AC005500.GENSCAN.421-5

CH22_EM:AC005500.GENSCAN.421-5 1.3
335742 EOS35673 CH22_3105FG_601_13_UNK_EM:AC005500.GENSCAN.491-14

CH22_FGENES.601_13 1.3
334993 EOS34924 CH22_2314FG_469_14_UNK_EM:AC005500.GENSCAN.36S-16

CH22_FGENES.469_14 1.3

323430 EOS23381 AW082479 EST cluster (not in UniGene) 1.3

306069 EOS06000 AA906983 EST singleton (not in UniGene) with exon hit 1.3

331681 EOS31612 W85712 Hs.119571 collagen; type III; alpha 1 (Ehlers-Danlos syndrome type IV; autosomal dominant) 1.3
337988 EOS37917 CH22_6441FG_UNK_EM:AC005500.GENSCAN.110-7

CH22_EM:AC005500.GENSCAN.110-7 1.3

313204 EOS13135 AI800S18 Hs.118158 ESTs 1.3 

323189 EOS23120 AL121194 Hs.120589 ESTs 1.3

318171 EOS18102 AA381202 EST cluster (not in UniGene) 1.3

307156 EOS07087 AJ186762 EST singleton (not in UniGene) with exon hit 1.3

332713 EOS32644 AA349732 Hs.78489 mutY (E. coli) homolog 1.3

312828 EOS12759 AI865455 Hs.211818 ESTs; Moderately similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.3

301127 EOS01058 AA758109 Hs.121072 ESTs 1.3

311260 EOS11191 AB72509 Hs.196582 ESTs 1.3
338364 EOS38295 CH22_7007FG_UNK_EM:AC005500.GENSCAN.323-7

CH22_EM:AC005500.GENSCAN.323-7 1.3
337904 EOS37835 CH22_6318FG_UNK_EM:AC005500.GENSCAN.56-17

CH22_EM:AC005500.GENSCAN.56-17 1.3

329347 EOS29278 c_X_hs gi|6456785|ref| gn 1 + 18433 18897 ex 4 4 CDSl 43.39 465 3718

CH.X_hs gi|6456785 1.3

313329 EOS13260 AW293704 Hs.122858 ESTs 1.3 

314367 EOS14298 AA535749 EST cluster (not in UniGene) 1.3

317098 EOS17029 AI123513 Hs.125456 ESTs 1.3

306462 EOS06393 AAS83397 EST singleton (not in UniGene) with exon hit 1.3

301254 EOS01185 AI049S24 EST cluster (not in UniGene) with exon hit 1.3

335504 EOS35435 CH22_2856FG_571_15_UNK_EM:AC005500.GENSCAN.460^4

CH22_FGENES.571_15 1.3
334270 EOS34201 CH22_1559FG_368_2_UNK_EM:AC005500.GENSCAN.22W

CH22_FGENES.368_2 1.3
334324 EOS34255 CH22_1616FG_375_1_UNK_EM:AC005500.GENSCAN^35-1

CH22_FGENES.375_1 1.3

304254 EOS04185 AA046273 Hs.111334 ferritin; light polypeptide 1.3

305731 EOS05682 AA829363 EST singleton (not in UniGene) with exon hit 1.3

323284 EOS23215 AA279381 Hs.190010 ESTs 1.3

322007 EOS21938 AW410646 Hs.165739 ESTs 1.3
334537 EOS34468 CH22_1839FG_403_2_UNK_EM:AC005500.GENSCAN.268-2

CH22_FGENES.403_2 1.3

302360 EOS02291 AJ010901 Hs.198267 mucin 4; tracheobronchial 1.3

311641 EOS11572 AI948829 Hs.213786 ESTs 1.3

324643 EOS24574 AI436356 Hs.130729 ESTs 1.3
327554 EOS27485 c_3_hs gi|5867801|ref| gn 2 - 23092 23191 ex 2 6 CDSi 10.44 100 107

CH.03_hs gi|5867801 1.3

312165 EOS12096 AW292139 Hs.115789 ESTs 1.3

304679 EOS04610 AA548741 EST singleton (not in UniGene) with exon hit 1.3

319564 EOS19495 AA026777 Hs.169732 ESTs 1.3

310860 EOS10791 AW015920 Hs.161359 ESTs 1.3

337161 EOS37092 CH22_5180FG_56O_ CH22_FGENES.561-3 1.3

311155 EOS11066 AI634410 Hs.197608 EST 1.3

336846 EOS36777 CH22_4540FG_263_5_ CH22_FGENES.263-5 1.3

310985 EOS10916 T51842 EST cluster (not in UniGene) 1.3

329499 EOS29430 c10_p2 gi|3983518|gb|A gn 5 + 33463 33789 ex 1 1 CDSo 34.50 327 97

CH.10_p2 gi|3983518 1.3
334924 EOS34855 CH22_2244FG_459_2_UNK_EM:AC005500.GENSCAN.351-2

CH22_FGENES.459_2 1.3

330861 EOS30792 AA084064 Hs.185747 ESTs 1.3 

324658 EOS24589 AI694767 Hs.129179 ESTs 1.3 

323362 EOS23293 AL135067 Hs.117182 ESTs 1.3

330468 EOS30399 L10343 Hs.112341 protease inhibitor 3; skin-derived (SKALP) 1.3

314198 EOS14129 AA897581 Hs.128773 ESTs 1.3
339436 EOS39367 CH22_8431FG_UNK_DJ579N16.GENSCAN.19-1

CH22_DJ579N16.GENSCAN.19-1 1.3

312483 EOS12414 AJ417526 Hs.184636 ESTs 1.3

321505 EOS21436 H73183 Hs.129885 ESTs 1.3

332254 EOS32185 N64702 Hs.194140 ESTs 1.3
328253 EOS28184 c_6_hs gi|6381894|ref| gn 1 - 4411 4509 ex 1 5 CDSl 4^0 99 4561

CH.06_hs gi|6381894 1.3

332357 EOS32288 W73417 Hs.103183 EST 1.3
329017 EOS28948 c_X_hs gi|6682532|ref| gn 7 - 255591 255672 ex 3 3 CDSf 12.94 82 22

CH.X_hs gi|6682532 1.3

337504 EOS37435 CH22_5739FG_803_2_ CH22_FGENES.803-2 1.3

316625 EOS16556 AA780307 Hs.122156 ESTs 1.3
335389 EOS35320 CH22_2739FG_545_1_UNK_EM:AC005500.GENSCAN.436-1

CH22_FGENES.545_1 1.3

310017 EOS09948 AI188739 Hs.148488 ESTs 1.3 

314354 EOS14285 AL037984 Hs.208982 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.3

324641 EOS24572 AI732515 Hs.189218 ESTs 1.3
335207 EOS35138 CH22_2546FG_510_4_UNK_EM:AC005500.GENSCAN.402^

CH22_FGENES.510_4 1.3

333673 EOS33604 CH22_934FG_246_5_UNK_EM:AC005500.GENSCAN.101-3

CH22_FGENES.246_5 1.3
334370 EOS34301 CH22_1664FG_378_18_UNK_EM:AC005500.GENSCAN^40-1

CH22_FGENES.378_18 1.3
328690 EOS28621 c_7_hs gi|6588001|ref| gn 7 - 571207 571274 ex 1 3 CDSl 3.34 68 4325

CH.07_hs gi|6588001 1.3

323208 EOS23139 AA203415 Hs.136200 ESTs 1.3

307010 EOS06941 AI140014 EST singleton (not in UniGene) with exon hit 1.3

316563 EOS16494 AI587083 Hs_>00558 ESTs; Weakly similar to !!!! ALU SUBFAMILY SP WARNING ENTRY !!!! [H.sapiens] 1.3

312219 EOS12150 H73505 Hs.117874 ESTs 1.3

319884 EOS19815 T73234 EST cluster (not in UniGene) 1.3

334720 EOS34651 CH22_2030FG_421_31_UNK_EM:AC005500.GENSCAN.282-31

CH22_FGENES.421_31 1.3
335836 EOS35767 CH22_3210FG_621_3_UNK_EM:AC005500.GENSCAN.513-3

CH22_FGENES.621_3 1.3

305448 EOS05379 AA737894 Hs.29797 ribosomal protein L10 1.3

314885 EOS14816 AI049878 Hs.133032 ESTs 1.3

320130 EOS20061 AI820675 Hs.203804 ESTs 1.3

310567 EOS10498 AI691065 Hs.155780 ESTs 1.3

323898 EOS23829 AA347566 EST cluster (not in UniGene) 1.3

336132 EOS36053 CH22_3522FG_703_2_UNK_DA59H18.GENSCAN.9-2

CH22_FGENES.703_2 1.3
337958 EOS37889 CH22_6403FG_UNK_EM:AC005500.GENSCAN.98-6

CH22_EM:AC005500.GENSCAN.98-6 1.3

305630 EOS05561 AA804508 EST singleton (not in UniGene) with exon hit 1.3

334916 EOS34847 CH22_2235FG_457_7_UNK_EM:AC005500.GENSCAN.347-1

CH22_FGENES.457_7 1.3
333542 EOS33473 CH22_799FG_178_4_UNK_EM:AC005500.GENSCAN.594

CH22_FGENES.178_4 1.3

331151 EOS31082 R82331 Hs.164599 ESTs 1.3

315095 EOS15026 AA831815 Hs.243788 ESTs 1.3

331593 EOS31524 N72150 Hs.50193 EST 1.3

323767 EOS23698 AI807408 Hs.166368 ESTs 1.3
334561 EOS34492 CH22_1865FG_405_1_UNK_EM:AC005500.GENSCAN77(W

CH22_FGENES.405_1 1.3

308191 EOS08122 AI538878 EST singleton (not in UniGene) with exon hit 1.3

319571 EOS19502 N91399 Hs.220826 ESTs 1.3

316200 EOS16131 AI914535 Hs.221377 ESTs 1.3

305996 EOS05927 AA889336 Hs.163355 EST 1.2

318055 EOS17986 AI249193 Hs.145945 ESTs 1.2

315570 EOS15501 AI860360 Hs.160316 ESTs 1.2

320792 EOS20723 AW236504 Hs.247020 ESTs 1.2

331649 EOS31580 W20364 Hs^5412 ESTs; Weakly similar to c29 [M.musculus] 1.2

303839 EOS03770 245939 EST cluster (not in UniGene) with exon hit 1.2

324399 EOS24330 AA814768 Hs.21396 ESTs 1.2

317172 EOS17103 AI741232 HS-M6744 ESTs 1.2

312452 EOS12383 AI692643 Hs.172749 ESTs 1.2
325482 EOS25413 c12_hs gi|5866957|ref| gn 3 + 47957 48078 ex 5 7 CDSi 10.25 122 1S96

CH.12_hs gi|5866957 1.2

311395 EOS11326 R23313 EST cluster (not in UniGene) 1.2

336124 EOS36055 CH22_3513FG_701_9_UNK_DA59H18.GENSCAN.M

CH22_FGENES.701_9 1.2

320082 EOS20013 AA487678 Hs.189738 ESTs 1.2

312168 EOS12099 T92251 Hs.198882 ESTs 1.2
338000 EOS37931 CH22_6472FG_UNK_EM:AC005500.GENSCAN.119-5

CH22_EM:AC005500.GENSCAN.119-5 1.2
338852 EOS38783 CH22_7705FG_UNK_DJ246D7.GENSCAN.12-1

CH22_DJ246D7.GENSCAN.12-1 1.2

312090 EOS12021 N57692 Hs.118064 ESTs 1.2

316480 EOS16411 AI749921 Hs.205377 ESTs 1.2
333259 EOS33190 OT^SOOFGJ ISJ.UNK^EM^COQSaiO.GENSCAN^-Z

CH22_FGENES.118_7 1.2
335211 EOS35142 CH22_2550FG_511_2_UNK_EM:AC005500.GENSCAN.403-2

CH22_FGENES.511_2 1.2

321950 EOS21681 AA594780 Hs.172318 ESTs 1.2

337937 EOS37868 CH22_6370FG_UNK_EM:AC005500.GENSCAN.86-1

CH22_EM:AC005500.GENSCAN.86-1 1.2

316576 EOS16507 AI732114 Hs.193046 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.2

322770 EOS22701 AA045796 Hs.159971 SWI/SNF related; matrix associated; actin dependent regulator of chromatin; subfamily b; member 1 1.2
329369 EOS29300 c_X_hs gi|5868842|ref| gn 1 - 121148 121516 ex 3 4 COS &50 369 3910

CH.X_hs gi|5868842 1.2

304183 EOS04114 H91161 EST singleton (not in UniGene) with exon hit 1.2

339370 EOS39301 CH22_8343FG_UNK_BA232E17.GENSCAN.1-12

CH22_BA232E17.GENSCAN.1-12 1.2

303941 EOS03872 AW473878 Hs.156110 immunoglobulin kappa variable 1D-8 1.2

302245 EOS02176 H18835 EST cluster (not in UniGene) with exon hit 1.2

335255 EOS35186 CH22_2597FG_517_2_UNK_EM:AC005500.GENSCAN.411-2

CH22_FGENES.517_2 1.2

316610 EOS16541 AW087973 Hs.126731 ESTs 1.2 

314915 EOS14846 M573072 Hs.187748 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.2

315426 EOS15357 AI391486 Hs.128171 ESTs 1.2
334003 EOS33934 CH22_1281FG_310_28_UNK_EM:AC005500.GENSCAN.167-27

CH22_FGENES.310_28 1.2

304350 EOS04281 AA186871 EST singleton (not in UniGene) with exon hit 1.2

325173 EOS25104 AI133215 Hs.144662 ESTs; Moderately similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.2

312313 EOS12244 AW293341 Hs.122505 ESTs 1.2
333366 EOS33297 CH22_612FG_142_3_UNK_EM:AC005500.GENSCAN.22-6






CH22_FGENES.142_3 1.2 
334970 EOS34901 CH22_2291FG_466_3_UNK_EM:AC005500.GENSCAN.361-2

CH22_FGENES.466_3 1.2
338668 EOS38599 CH22_7441FG_UNK_EM:AC005500.GENSCAN.465-1

CH22_EM:AC005500.GENSCAN.455-1 1.2

336502 EOS36433 CH22_3926FG_833_8_UNK_DJ579N16.GENSCAN.W

CH22_FGENES.833_8 1.2

309438 EOS09369 AW102802 Hs.225787 ESTs; Moderately similar to hypothetical protein [H.sapiens] 1.2
336194 EOS36125 CH22_3591FG_717_20_UNK_DA59H18.GENSCAN.20-19

CH22_FGENES.717_20 1.2

336678 EOS36609 CH22_4156FG_43_6_ CH22_FGENES.43-6 1.2

321401 EOS21332 W90406 Hs.35962 ESTs 1.2

306026 EOS05957 M902309 EST singleton (not in UniGene) with exon hit 1.2

336434 EOS36365 CH22_3854FG_826_1_UNK_BA232E17.GENSCAN.8-1

CH22_FGENES.826_1 1.2

315257 EOS15188 AW157431 Hs^48S41 ESTs 1.2

328349 EOS28280 c_7_hs gi|5868383|ref| gn 7 - 260704 260804 ex 2 9 CDSi 4.37 101 621

CH.07_hs gi|5868383 1.2

326112 EOS26043 c17_hs gi|5867192|ref| gn 1 + 2151 2725 ex 1 1 CDSo 54.87 575 1272

CH.17_hs gi|5867192 1.2

333995 EOS33926 CH22_1272FG_310_19_UNK_EM:AC005500.GENSCAN.167-18

CH22_FGENES.310_19 1.2

323683 EOS23614 AI380045 Hs.225033 ESTs 1.2

330143 EOS30074 c21_p2 gi|4210430|emb| gn 3 + 184737 184848 ex 4 4 CDSl 1.71 112 111

CH.21_p2 gi|4210430 1.2

329789 EOS29720 c14_p2 gi|6469354|emb| gn 2 - 118977 119036 ex 1 3 CDSl 1.19 60 1517

CH.14_p2 gi|6469354 1.2

324397 EOS24328 AA307836 Hs.118758 ESTs; Weakly similar to RLF [H.sapiens] 1.2

308729 EOS08660 AI799766 Hs.208627 EST 1.2

323939 EOS23870 AW499632 Hs.115696 ESTs 1.2

333444 EOS33375 CH22_694FG_153_1_UNK_EM:AC005500.GENSCAN^4-1

CH22_FGENES.153_1 1.2

306302 EOS06233 AA937901 EST singleton (not in UniGene) with exon hit 1.2

313693 EOS13624 AW469180 Hs.170651 ESTs 1.2

316652 EOS16583 AA789249 EST cluster (not in UniGene) 1.2

332325 EOS32256 T79428 Hs.191264 ESTs 1.2

336235 EOS36166 CH22_3633FG_740_2_UNK_DA59H18.GENSCAN.44-2

CH22_FGENES.740_2 1.2

319436 EOS19367 R02750 EST cluster (not in UniGene) 1.2 

312335 EOS12266 AW043620 Hs.236993 ESTs 1.2

322109 EOS22040 AI884327 Hs.244737 ESTs 1.2

328466 EOS28397 c_7_hs gi|5868434|ref| gn 1 - 15643 15900 ex 1 2 CDSl 2.36 258 1608

CH.07_hs gi|5868434 1.2

323244 EOS23175 T70731 EST cluster (not in UniGene) 1.2

312510 EOS12441 AA779907 Hs.117558 ESTs 1.2

314853 EOS14784 AA729232 Hs.153279 ESTs 1.2

336946 EOS36877 CH22_4731FG_355_2_ CH22_FGENES.355-2 1.2

303874 EOS03805 AA258921 EST cluster (not in UniGene) with exon hit 1.2

312658 EOS12589 AA730280 Hs.120936 ESTs 1.2

308354 EOS08285 AI611044 EST singleton (not in UniGene) with exon hit 1.2

310073 EOS10004 AI335004 Hs.148558 ESTs 1.2

324777 EOS24708 AA744046 Hs.133350 ESTs 1.2

300897 EOS00828 AI890356 Hs.127804 ESTs 1.2

308371 EOS08302 AI620666 Hs.242510 EST 1.2

306358 EOS06269 AA961821 EST singleton (not in UniGene) with exon hit 1.2

312295 EOS12226 AA578233 Hs.173863 ESTs 1.2

319792 EOS19723 R20317 Hs.22968 ESTs 1.2

338546 EOS38477 CH22_72OTG_UNK_EM:AC005500.GENSCAN.410-1

CH22_EM:AC005500.GENSCAN.410-1 1.2

314546 EOS14477 AW007211 Hs.186672 ESTs 1.2

338494 EOS38425 CH22_7184FG_UNK_EM:AC005500.GENSCAN.385-5

CH22_EM:AC005500.GENSCAN.385-5 1.2

331131 EOS31062 R54797 rk26238 EST; Weakly similar to reverse transcriptase homolog [H.sapiens] 1.2

309939 EOS09870 AW419122 EST singleton (not in UniGene) with exon hit 1.2

332932 EOS32863 CH22_153FG_38_6_UNK_C20H12.GENSCAN.2W

CH22_FGENES.38_6 1.2

309653 EOS09584 AW196800 Hs.180842 ribosomal protein L13 1.2

318647 EOS18578 AI526152 EST cluster (not in UniGene) 1.2

304044 EOS03975 T52479 Hs.252259 ribosomal protein S3 1.2
330307 EOS30238 c_7_p2 gi|4877982|gb|A gn 2 + 107384 107559 ex 2 4 CDSi 9.96 176 4

CH.07_p2 gi|4877982 1.2

314499 EOS14430 AL044570 Hs.147975 ESTs 1.2 

338053 EOS37984 CH22_6552FG_UNK_EM:AC005500.GENSCAN.158-1

CH22_EM:AC005500.GENSCAN.158-1 1.2
332991 EOS32922 CH22_215FG_56_4_UNK_EM:AC000097.GENSCAN.174

CH22_FGENES.56_4 1.2

306308 EOS06239 AA946870 EST singleton (not in UniGene) with exon hit 1.2

338120 EOS38051 CH22_6655FG_UNK_EM:AC005500.GENSCAN.195-1

CH22_EM:AC005500.GENSCAN.195-1 1.2

313703 EOS13634 AI161293 Hs.146862 ESTs; Weakly similar to KIAA0525 protein [H.sapiens] 1.2

330563 EOS30494 U50553 Hs.147916 DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 3 1.2

332886 EOS32817 CH22_106FG_33_7_UNK_C20H12.GENSCAN.22-9

CH22_FGENES.33_7 1.2

303844 EOS03775 U94362 Hs.58589 glycogenin 2 1.2

321755 EOS21686 AI215881 Hs.144042 ESTs 1.2

333532 EOS33463 CH22_789FG_175_19_UNK_EM:AC005500.GENSCAN.53-25

CH22_FGENES.175_19 1.2
332863 EOS32794 CH22JB1FG_28_3_UNK_C20H12.GENSCAN.1M

CH22_FGENES.28_3 1.2
333254 EOS33185 CH22_495FG_118_2_UNK_BfMO00K00.GENSCAN.2-2

CH22_FGENES.118_2 1.2

317459 EOS17390 AI367254 Hs.131248 ESTs 1.2

315353 EOS15284 AW452608 Hs.129817 ESTs 1.2

300732 EOS00663 AI369956 Hs.257891 ESTs 1.2

303502 EOS03433 AA483528 EST cluster (not in UniGene) with exon hit 1.2

333126 EOS33057 CH22_355FG_82_3_UNK_EM:AC000097.GENSCAN.66-10

CH22_FGENES.82_3 1.2
332929 EOS32860 CH22_150FG_38_3_UNK_C20H12.GENSCAN.2W

CH22_FGENES.38_3 1.2
329502 EOS29433 c10_p2 gi|3983517|gb|U gn 1 + 75 338 ex 1 1 CDSo 46.82 264 100

CH.10_p2 gi|3983517 1.2

333408 EOS33339 CH22_657FG_146_6_UNK_EM:AC005500.GENSCANi^6

CH22_FGENES.145_6 1.2

315472 EOS15403 AA828850 Hs.165469 ESTs 1.2
328290 EOS28221 c_7_hs gi|5868363|ref| gn 2 - 127366 127496 ex 1 5 CDSl 5.24 131 289

CH.07_hs gi|5868363 1.2

328662 EOS28593 c_7_hs gi|6004473|ref| gn 22 + 1184773 1184855 ex 7 8 CDSi 12.72 83 3916

CH.07_hs gi|6004473 1.2

319808 EOS19739 T58960 EST cluster (not in UniGene) 1.2

303920 EOS03860 AW470753 EST singleton (not in UniGene) with exon hit 1.2

315712 EOS15643 AI950133 Hs.120882 ESTs; Moderately similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.2

307391 EOS07322 AI225058 EST singleton (not in UniGene) with exon hit 1.2

335499 EOS35430 CH22 J851 FG_571_8_UNK^EKtACOO5500.GENSCAN.46a-2B 

CH_2_FGENES£71_8 1.2 

303792 EOS03723 C75094 Hs.199839 ESTs; Highly similar to NG22 [H.sapiens] 1.2

30 327287 EOS27218 c_1_hsgi|5867479|iBf|gn 1-62838 63024 ex 4 5 COS! 11.66 187 1628 

CR01_hsg55867479 1.2 

317713 EOS17644 AI733306 Hs.128071 ESTs 1.2 
330137 EOS30068 c21_p2gij4210430|emb|gn1-2122021377ex23CCK1 1.89158104 

CU21 _p2gi|4210430 1.2 

308157 EOS08088 AI510824 Hs.75988 thymosin; beta 4; X chromosome 1.2

314452 EOS14383 AL042699 Hs.209222 ESTs 1.2

308268 EOS08199 AI567509 Hs.172928 collagen; type I; alpha 1 1.2

321467 EOS21398 X13075 EST cluster (not in UniGene) 1.2

320993 EOS20924 AL050145 Hs.25986 Homo sapiens mRNA; cDNA DKFZp586C2020 (from clone DKFZp586C2020) 1.2

40 336778 EOS36709 CH22_4367FGJ59_4_ CH22.FGENES.1594 1.2 

319827 EOS19758 T62778 EST cluster (not in UniGene) 1.2

308249 EOS08180 AI560998 EST singleton (not in UniGene) with exon hit 1.2

310094 EOS10025 AW450987 Hsi35240 ESTs 1.2 

336902 EOS36833 CH22_4655FG_331_2- CH22.FGENES.331-2 1.2 

45 339044 EOS38975 CH22_7944FG_LINK_DA59H18.GENS(m27^ 

CH22_DA59H18.GENSCAN.27-5 1.2 

336675 EOS36606 CH22_4153FG_43_3_ CH22.FGENES.43-3 1.2

303563 EOS03494 AA367699 Hs.118787 transforming growth factor; beta-induced; 68kD 1.2

330673 EOS30604 D57823 Hs.92962 Sec23 (S. cerevisiae) homolog A 1.2

311814 EOS11745 AW377113 Hs.119640 ESTs; Moderately similar to zinc finger protein [H.sapiens] 1.2
335481 EOS35412 CH2^ - 2^FG_570JO_UNK_EMAC00550aGENSCAN.46(M 

CH22.FGENESi70J0 1.2 

314775 EOS14706 AI149880 Hs.188809 ESTs 1.2 

324961 EOS24892 AA613792 EST cluster (not in UniGene) 1.2

313458 EOS13389 AA007259 Hs.255853 ESTs 1.2

307074 EOS07005 AI150989 EST singleton (not in UniGene) with exon hit 1.2

337964 EOS37895 CH223410FG_UNK^EM*C005500.GENSCM100-9 

OI22_EMAC0(K500.GENSCAN.100-9 1.2 
326519 EOS26450 c19Jis gl|5867439|re11 gn 4 + 166004 166243 ex 4 5 CDS 4.49 240 2534 

60 CH.19 hsgi[5867439 1.2 

337366 EOS37297 CH22_5551FG_736_1_ CH22_FGENES.736-1 1.2 

322340 EOS22271 AF088076 EST cluster (not in UniGene) 1.2

307954 EOS07885 AI419692 EST singleton (not in UniGene) with exon hit 1.2

328615 EOS28546 c_7_hs gi|5858239|ret] gn 2 + 35214 35347 ex 3 4 CDSi 11.49 134 3651 

65 CH.07_hsgl|5B68239 1.2 

317787 EOS17718 AW339612 Hs.249364 ESTs 1.2 
335288 EOS35219 CH22J630FG_527J_UNK^MAC00550aGENSCAR421-1 

CH22.FGENES.527J , 1.2 

323175 EOS23106 AI827137 Hs.184023 ESTs 1.2

330893 EOS30824 AA149620 Hs.71999 ESTs 1.2

306810 EOS06741 AI057294 EST singleton (not in UniGene) with exon hit 1.2

338239 EOS38170 CH22_6833FG_UNICEMAC0DomGENSCML264$ 

CH22_EMAC005500.GENSCAN.2645 1.2 

332347 EOS32278 W60326 H&521716 ESTs 1.2 

309782 EOS09713 AW275156 Hs.156110 immunoglobulin kappa variable 1D-8 1.2

322518 EOS22449 AI133446 EST cluster (not in UniGene) 1.2

301187 EOS01118 AA806542 EST cluster (not in UniGene) with exon hit 1.2

312129 EOS12060 AW300867 EST cluster (not in UniGene) 1.2

334714 EOS34645 CH22_2024FG_421_25_UNK_EMAC00550aGENSCAN282-25 

80 CH22^FGENES.421_25 1.2 

316586 EOS16517 AI205077 Hs.144689 ESTs 1.2 

320488 EOS20419 R31386 EST cluster (not in UniGene) 1.2

327458 EOS27389 c_2_hs gi]6004455|refign 3* 173257 173378 ex 5 7 CDSi 4.031221184 

CR02Jsgf|6004455 1.2 

85 336707 EOS36638 CH2Z.4212FG.64J_ CH22.FGENES.64-3 1.2 

313561 EOS13492 AA040155 EST cluster (not in UniGene) 1.2
330906 EOS30837 AA169498 Hs.72804 ESTs 1.2

330987 EOS30918 H40988 Hs.131965 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.2

325041 EOS24972 AI809182 Hs.130907 ESTs 1.2 

313225 EOS13156 AA502384 Hs.151529 ESTs 1.2 

305295 EOS05226 AA687131 EST singleton (not in UniGene) with exon hit 1.2

306898 EOS06827 AI093383 EST singleton (not in UniGene) with exon hit 1.2

326981 EOS26912 c21_hs gi|85B8016|reflgn 3* 105091 106038 ex 1 1 COSo 122.69 94B 567 

CH.21_hs g!|65B8016 1.2 

332225 EOS32156 N33213 Hs.100425 ESTs 1.2 

10 318802 EOS18733 R13443 Hs.92414 ESTs 1.2 

318413 EOS18344 AI138592 Hs.144938 ESTs 1.2

312292 EOS12223 AW451893 Hs.151124 ESTs 1.2

323753 EOS23684 AA327102 EST cluster (not in UniGene) 1.2

313582 EOS13513 AW207684 Hs.13583 ESTs 1.2 

15 317836 EOS17767 AA983913 Hs.128929 ESTs 1.2 
332868 EOS32799 CH2^86FG_28XUNK.C20H12GENSCAN.1M 

CH22.FGENES.28 8 1.2 

336924 EOS36855 CH22_4699FG„347_9_ CH22.FGENES.347-9 1.2 
327791 EOS27722 c_5_hs gi|5867977|ref| gn 1+22491 22610 ex 67 COSI 11.29 12) 658 

20 CH.05_hsgi|5867977 U 

330717 EOS30648 AA233926 Hs.23635 ESTs 1.2 

322944 EOS22675 AA112573 EST cluster (not in UniGene) 1.2

312108 EOS12039 T62331 Hs.127453 ESTs 1.2 

332570 EOS32501 AA401376 HsiB176 ESTs 1.2

330880 EOS30811 AA132420 Hs.53542 KIAA0986 protein 1.2

310341 EOS10272 AW302773 EST cluster (not in UniGene) 1.2

334012 EOS33943 CH22 w 1290FGJ13_3_UrK.EM^C005500.GENSCAN.169-3 

- CH22_FGENES.313_3 U 

318230 EOS18161 AA558125 EST cluster (not in UniGene) 1.2

30 336071 EOS36002 CH22.3457FG_685_3_UNK.DJ32liaGENSCAN.21-6 

CH22J r GENES.685_3 1.2 
338510 EOS38441 CH22„7208FG_UNICEKtAC005500.GENSCAN.391.22

CH22_EMAC005500.GENSCAN.391-22 1.2 
334487 EOS34418 CH22_1786F6_395_9_UNK_EMAC005500.GENSCAN.258-10 

35 CH2LFGENES^95_9 1.2 

320661 EOS20592 AA864846 EST cluster (not in UniGene) 1.2

335200 EOS35131 CH22J538FG_508_9_UNX_EM^C005500.G£NSCAH401-S

CH22„FGENESi08.9 1.2 
333582 EOS33513 CH22_842FGJ01JLUNK.EMACO05500.GENSCANJ2^3 

40 CH22_FGENES.2Q1_2 1.2 

320789 EOS20720 R78712 EST cluster (not in UniGene) 1.2

321185 EOS21116 H51659 Hs.189854 ESTs 1.2
337740 EOS37671 CH22_6085FG__UNlCEMACOOOOg7.GENSCAN.10W 

CH2^fcW(»(J0097.GENSCAN.10(W 1.2 

45 315064 EOS14995 AA775208 Hs.136423 ESTs 1.2 
334883 EOS34814 CH22L2197FG^451_6_UNK_EMAC005500.GENSCAN.340^ 

CH22_FGENES.451_6 1.2 

331825 EOS31756 AA411144 Hs.1(W768 ESTs 1.2 

319141 EOS19072 F12377 EST cluster (not in UniGene) 1.1

50 333682 EOS33613 CH22_944FG^47_10_UNK_EMAC005500.GENSCAN.1 02-10 

CH22_FGENESi47_10 1.1 
336140 EOS36071 CH22_3530FG_705JJJNK_DA59H18.GENSCAN.10-2 

CH22_FGENE&705_2 1.1 

320727 EOS20658 U96044 EST cluster (not in UniGene) 1.1

55 323947 EOS23878 AA649842 Hs.186667 ESTs 1.1 

324746 EOS24677 AA603367 Hs.222294 ESTs 1.1 

306744 EOS06675 AI031882 EST singleton (not in UniGene) with exon hit 1.1

326517 EOS26448 c19Jis gi]5867439[refi gn 1 + 44732 46356 ex 6 6 COS1 148.22 1625 2512 

CK19Jisgi|5867439 1.1 
60 333597 EOS33528 CH22_858FG_21 1JJJNKjENWC005500.GENSCAN.79-5 

CH22_FGENES^11_5 1.1 
330135 EOS30066 c21_p2 gi|4456470|emb| gn 2 -121583 121885 ex 2 2 COSf 18.67 303 102 

CH.21_p2gq4456470 1.1 

315118 EOS15049 AA564921 Hs.143899 ESTs 1.1 

302693 EOS02824 AL117539 Hs.173515 Homo sapiens mRNA; cDNA DKFZp586H021 (from clone DKFZp586H021) 1.1

337169 EOS37100 CH22L51B9FGJ63 1_ CH22_FG£NES.563-1 1.1 
336121 EOS36052 CH22_3510FGJ01_6_UNK_DA59H18.GENSCAN.&« 

CH22J=GENEa701_6 1.1 

323332 EOS23263 AI829520 Hs.227513 ESTs 1.1 

320911 EOS20842 AI056872 Hs.133386 ESTs 1.1
327990 EOS27921 c_6_hs gi|586B218Jrefl gn 2 - 36225 36503 ex 1 2 COS1 16.35 279 1419 

CR06Jtsgi|5868218 1.1 

320425 EOS20356 C14069 Hs.201627 ESTs; Moderately similar to !!!! ALU SUBFAMILY SQ WARNING ENTRY !!!! [H.sapiens] 1.1
327075 EOS27006 c21 Jis gl|6531965jref| gn 58 + 4041318 4041431 ex 4 4 CDS! 1.79 1 14 1285 

75 ai2Usgi|6531965 1.1 

314384 EOS14315 AA535840 Hs.162203 ESTs; Weakly similar to alternatively spliced product using exon 13A [H.sapiens] 1.1
338716 EOS38647 CH22_7502re_UN)CBMC005500.GENSCAN.488-9 

CH22JEM*C005500.GENSCAN.488-9 1.1 

330886 EOS30817 AA135606 Hs.189384 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.1
80 327331 EOS27262 c_1_hs gi|5867516Jretj gn 4 - 55606 55737 ex 2 6 CDS 7.01 132 2349 

CK01Jisgi|5867516 1.1 
326714 EOS26645 c20_hs gl|5867595|ref| gn 2+ 124490 12456B ex 56 CDSI 0.11 79 1020 

CH.20J»sgq5BS7595 1.1 

316734 EOS16665 AW080237 Hs.52884 ESTs 1.1

311660 EOS11591 AI978583 Hs.232161 ESTs 1.1

312757 EOS12688 AI285970 Hs.183817 ESTs 1.1
331686 EOS31617 W88502 Hs.182258 ESTs 1.1
337840 EOS37771 CH22_6223FG_UNK.EM^C005500.GENSCAN.26-9 

CH2ZJMAW055QQ.GENSCAN_S-9 1.1 

332093 EOS32024 AA608794 Hs.112592 ESTs 1.1 

319595 EOS19526 H81361 Hs.194485 ESTs 1.1

315990 EOS15921 AI800041 Hs.190555 ESTs 1.1

322438 EOS22369 W44531 Hs.167851 ESTs 1.1
332965 EOS32896 CH22J 89FG_50_3_LI^EMAC000097.GENSCAN.3-5 

CH22_FGENES.50_3 1.1

10 337182 EOS37113 CH22_5204FG_570JL CH22.FGENES.570.2 1.1 
334948 EOS34879 CH22_2269FG_465J5JUNK_EMAC0055Q0.GENSCAN.359-13 

CH2^GENES.465J5 1.1 . 
325864 EOS25795 c16Jsgi|5867069|ra1)gn2.110834110904ex33CDSf 9.7671 457 

CH.16_hsgI|5887069 1.1 
15 337760 EOS37691 OT2?_6110FG_UNK.EM:AC000097.GENSCAN.116^ 

CH2*_EMAC00IW97.GENSCAN.11W 1.1 

315422 EOS15353 AW135357 Hs.192374 ESTs 1.1 
338889 EOS38820 CH2^7746FG_UNK_DJ32I10.GENSCAN.7-1 

CN2ZJXI32110.GENSCAN.7-1 1.1 
20 332961 EOS32B92 CH2?_1B5FG^48J8.UNKJEMAC000097.GENSCAN2-14 

CH22J : GENES.48_1B 1.1 

314703 EOS14634 AI791249 EST cluster (not in UniGene) 1.1

317791 EOS17722 A180150Q Hs.128457 ESTs 1.1 
333680 EOS33611 CH22^942FG_247J_UNK_EWAC005500.GENSCAN.102-7 

25 CHQJGENEBJffJ 1.1 

322419 EOS22350 AA248987 Hs.14084 ESTs; Highly similar to zinc RING finger protein SAG [M.musculus] 1.1
338124 EOS38055 CH22_6661FG_UNK„EIAAC00550aGENSCAN. 196-2 

CH22 EMACC05500.GENSCAN.196-2 1.1 

308884 EOS08815 AI833131 Hs.179100 ESTs 1.1 
30 333349 EOS33280 CH22_595FG_140_3_UNK_EM^C005SOO.GENSCArJ^W 

CH22_FGENES.140_3 1.1 

313150 EOS13081 AA824410 Hs.165003 ESTs 1.1 
339208 EOS39139 CH22Jt46FG_UNKJ : F1 13D11.GENSCAN.6-3 

CH22.FF113D11.GENSCAN.&-3 1.1 
35 335653 EOS35584 CH22_3013FG_590_4JJNK„EM^COC5500.GENSCAN.4844 

CH22.FGENES.590 4 1.1 

319524 EOS19455 AA682865 Hs.194441 ESTs 1.1

301576 EOS01507 AI682905 Hs.146875 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.1

317598 EOS17529 AW206035 Hs.192123 ESTs 1.1 
40 333473 EOS33404 CH2?_724FG_16^3_UNK.EIAAC005500.GENSCAN.42-10 

CH22J=GENES.162_3 1.1 
333949 EOS33880 CH22_1225FG_303_5_UNK_EJAACOJ5a)0.GENSGAM162-9 

CH22„FG£NES.303_5 1.1 
339256 EOS39187 CH2^8207FG_UNK.BA354I1ZGENSCAN.7-11 

45 CH22_BA354112.GENSCAN.7-11 1.1 

332884 EOS32815 CH22L104FG_33_5.UNieC20H12.GENSCAN.22.7 

CH22_FGENES.33_5 1.1 

314660 EOS14591 AA436007 Hs.188780 ESTs 1.1 
333220 EOS33151 CH22_457FG_1 04 J2_UNK_B*AC00Q(K7.GENSCAN.1 08-11 

50 CH22.FGENES.104J2 1.1 

308106 EOS08037 AI476803 EST singleton (not in UniGene) with exon hit 1.1

320709 EOS20640 AA456660 Hs.154165 ESTs 1.1

307612 EOS07543 AI290787 EST singleton (not in UniGene) with exon hit 1.1

330286 EOS30217 c„5_p2 gi|6671913|gb|Agn 2 - 31050 31 171 ex 27 CDSi 8.84122791 

55 CH.05j>2gi|6871913 1.1 

304495 EOS04426 AA446446 EST singleton (not in UniGene) with exon hit 1.1 

310583 EOS10514 AW205632 Hs.211198 ESTs 1.1

332896 EOS32827 CH22L117FG.35_10UNreC20H1ZGENSGAN.24-9 

CH22_FGENES^5J0 1.1 
60 337602 EOS37533 CH22_5895FG_UNK_C20H1ZGENSCAN.15-1 

CH22.C20H1ZGENSCAN.15-1 1.1 

307626 EOS07557 AI300035 EST singleton (not in UniGene) with exon hit 1.1

334696 EOS34627 CH22_2006FG 421 5 UNrCEMAC005500.GENSCAN.282-5 

CH22_FGENES.421_5 1.1 

318652 EOS18583 T53259 EST cluster (not in UniGene) 1.1

337844 EOS37775 CH22 6229FG_UNK_EM:AC0055rjO.GENSCAN.3O-9 

fJH22_EMAC00550aGENSCAN.30-9 1.1 
334823 EOS34754 CHZ5L2137FG 437 5 UNrCEMAC0Q5500.GENSCAN.301-7 

CH22.FGENES.437J 1.1 
70 333928 EOS33859 CH22_1201FGL299_2_UNK^EMAC005500.GENSCAN.158^ 

CH22_FGENES.299_2 1.1 

337503 EOS37434 CH22_5738FG_803JL CH22_FGENESJM)3-1 1.1 

323044 EOS22975 AA148725 Hs.154190 ESTs 1.1 
329164 EOS29095 c_xjis gi|5868691lnif| gn 1 +62305 62517 ex 22 CDS! 17.51 2131868 

75 CH.X_hsgi]5B68691 1.1 

335468 EOS35399 CH22_2819FG 567_4JJNK^EM*CO055O0.GENSCAN.454-12 

CH2Z_FGENES.567_4 1.1 
338962 EOS3B893 CH22_7838FG_UNK_OJ32l10.GENSCAN.23-39 

CH22JXI32M O.GENSCAN.23-39 1.1 

323570 EOS23501 AL038623 Hs.208752 ESTs; Weakly similar to !!!! ALU SUBFAMILY SX WARNING ENTRY !!!! [H.sapiens] 1.1
333568 EOS33499 CH22_826FG 185 1 UNrLEJAAC005500.GENSCAN.64-1 

CH22_FG_NES.185_1 1.1 

331865 EOS31796 AA425756 Hs.98445 ESTs 1.1 
336246 EOS36177 CH22.3644FG_746_5_UNK_OA59H18.GENSCAN.4W 

85 CH22_FGENES.746_5 1.1 

337238 EOS37169 CH22_5343FG_641 3 CH22.FGENES.641-3 1.1 
305089 EOS05020 AA642622 EST singleton (not in UniGene) with exon hit 1.1

300097 EOS00028 AI916973 Hs.213603 ESTs 1.1

313134 EOS13065 N63406 Hs.258697 ESTs 1.1

337452 EOS37383 CH22_5665FG_775_1_ CH22_FGENES.775-1 1.1

325433 EOS25364 c12_hs gi|5866936|ref| gn 4 - 480706 480826 ex 3 4 CDSi 1.99 121 818 CH.12_hs gi|5866936 1.1

335999 EOS35930 CH2^3380FG.657J_UNieDJ246D7.GENSCAN.1M CH22_FGENES.657J 1.1

333580 EOS33511 CH22_840FG_1 99JtUNlCEfcftAOQ05500.6ENSCAN.71-2 CH22_FGENES.199_2 1.1

336836 EOS36767 CH22_4512FG_247_11_ CH22_FGENES.247-11 1.1

334677 EOS34608 CH22 ta .18B6FG_418_30_UNK_EMAC005500.GENSCAN.279J1 CH22_FGENES.418_30 1.1

329062 EOS28993 c_x_hs gi|5868590|ref| gn 3 - 58977 59094 ex 4 11 CDSi -6.19 118 627 CH.X_hs gi|5868590 1.1

333671 EOS33602 O^93ffGJ45_5_UNICEMAC005500.GENSCAN.100.12 CH22_FGENES.245_5 1.1

304941 EOS04872 AA812612 EST singleton (not in UniGene) with exon hit 1.1

315772 EOS15703 AW515373 Hs.158893 ESTs 1.1

301281 EOS01212 AA843966 Hs.190586 ESTs 1.1

333520 EOS33451 CH22L777FG_174.3JJNK_EM-\C005500.GENSCANiW CH22_FGENES.174_3 1.1

315203 EOS15134 AI559820 Hs.199438 ESTs 1.1

315927 EOS15858 AW025517 Hs.133250 ESTs 1.1

317161 EOS17092 AA972165 Hs.150308 ESTs 1.1

337692 EOS37623 CH22J028FG_UN1CB^C000097.GENSCAN.78.12 CH22_EIAACC00097.GENSCAN.78-i2 1.1

331472 EOS31403 N24830 yx70a01.s1 Soares melanocyte 2NbHM Homo sapiens cDNA clone IMAGE:267050 3' similar to gb|M87912|HUMALNE662 Human carcinoma cell-derived Alu RNA transcript, (rRNA); contains Alu repetitive element;, mRNA sequence. 1.1

336439 EOS36370 CH22.3859FG 827 4 UNK_DJ579N16.GENSCAN.1-3 CH22_FGENES.827_4 1.1

326882 EOS26813 c20_hs gi|6682509|ref| gn 2 - 167988 168179 ex 4 4 CDSf 18.69 192 2238 CH.20_hs gi|6682509 1.1

335977 EOS36908 CH22_4793FG_380J CH2zTfGENES.380-9 1.1

333983 EOS33914 CH22J 260FG.31 ol7_UNK_EMAC00550Q.GENSCAN.1 67-5 CH22_FGENES.310_7 1.1

328878 EOS28809 c_7_hs gi|6552423|ref| gn 1 + 105580 105774 ex 6 7 CDSi 2.91 195 6195 CH.07_hs gi|6552423 1.1

330415 EOS30346 D83777 Hs.75137 KIAA0193 gene product 1.1

324824 EOS24755 AI826999 Hs.224624 ESTs 1.1

325815 EOS25746 c14_hs gi|6682483|ref| gn 1 - 129273 130754 ex 1 1 CDSo 11.82 1482 2225 CH.14_hs gi|6682483 1.1

300463 EOS00394 N52510 Hs.186470 ESTs 1.1

335708 EOS35639 CH22_3069FG 599 8 UNKJMACOO5500.GENSCAN.49O-11 CH22_FGENES.599_8 1.1

324575 EOS24506 AW502257 EST cluster (not in UniGene) 1.1

337951 EOS37882 CH22_6391 FG_UNK_EIAAC005500.GENSCAN.94-1 CH22_EMAC005500.GENSCAN.94-1 1.1

335935 EOS35866 CH22_3313FGJ46XUNK_DJ246D7.GENSCAN.1-5 CH22JGENES.646J 1.1

334914 EOS34845 (>l22J233FG_457J_UNK_EMAC005500.GENSCAN.346-2 CH22_FGENES.457_3 1.1

309527 EOS09458 AW150648 Hs.75621 protease inhibitor 1 (anti-elastase); alpha-1-antitrypsin 1.1

318901 EOS18832 AW368520 Hs.24639 ESTs 1.1

320484 EOS20415 AA094436 Hs.155712 follistatin-like 1 1.1

333655 EOS33596 CH22^926FG_244_1_UNrvEWtAC005500.GENSCAN.99-1 CH22_FGENES.244_1 1.1

335860 EOS35791 CH22_3235FGJ29J_UNieEMAC005500.ffiNISCAN.51W CH22_FGENES.629_5 1.1

313339 EOS13270 AJ582536 Hs.163495 ESTs 1.1

300149 EOS00080 AW448916 Hs.149018 ESTs 1.1

318112 EOS18043 AW28162 Hs.132307 ESTs 1.1

337807 EOS37738 CH22_6178FG_UNK_EMAC00550aGENSCAN.94 CH22_EMAC005500.GENSCAN.&4 1.1

336917 EOS36848 CH22_4668FGJ46_4_ CH22JGENES.34&4 1.1

337489 EOS37420 CH22_5722FG_799_2_ CH22_FGENES.799-2 1.1

320112 EOS20043 T92107 Hs.188489 ESTs 1.1

332975 EOS32906 CH22L.199FG_51JO_UNK_EMAC000097.GENSCAN.4-12 CH22_FGENES.51_10 1.1

327805 EOS27736 c_5_hs gi|5867968|ref| gn 2 + 19952 20019 ex 1 2 CDSf 9.47 68 988 CH.05_hs gi|5867988 1.1

339215 EOS39146 CH22J153FG_UNK_FF1 13D1 1.GENSCAN.6-1 0 CH22_FF113D11.GENSCAN.6-10 1.1

311965 EOS11896 T69279 EST cluster (not in UniGene) 1.1

314043 EOS13974 AA827082 EST cluster (not in UniGene) 1.1

333447 EOS33378 CH22.697FGJ54J_UNK_EMAC005500.GENSCAN^6 CH22_FGENES.154J5 1.1

333242 EOS33173 CH22_481FG_111_6_UNK_EMAC000097.GENSCAN.12tW CH22_FGENES.111_6 1.1

338596 EOS38527 CH22_7343FG_UNK_EMAC005500.GENSCAN.437-2 CH22JMAO005500.GENSCAN.437-2 1.1

329989 EOS29920 c16_p2 gi|4567166|gb|A gn 2 + 72861 73052 ex 1 3 CDSf 18.02 192 590 CH.16_p2 gi|4567166 1.1

315675 EOS15606 AA652272 Hs.197320 ESTs 1.1

336722 EOS36653 CH22_4245FG_84_2_ CH22_FGENES.84-2 1.1
334220 EOS34151 CH22_1503FG_359.4_UNieEMAC00550aGENSCAN.2l7-7

CH2^JGENEa359„4 1.1 

336703 EOS36634 CH22J201FG_56_3_ CH22.FGENES.56-3 1.1 
_ 336397 EOS36328 CH2L3812FG_823Ji.UNJeBA232E17.GENSCAN.6-11 

5 CH22_FGENES.823_12 1.1 

316105 EOS16036 AW295687 Hs.254420 ESTs 1.1 
334661 EOS34592 CH2^1989FG_418_9_UNK^EJtAC00550aGENSCAN.279-13 

CH22JGENES.418..9 1.1 

307783 EOS07714 AI347274 EST singleton (not in UniGene) with exon hit 1.1

10 333997 EOS3392B CH22J275FG 310 22.UNK_EiytAC005500.GENSCAN.1 67-21 

CH22_FGENES.31Q_22 1.1 

331903 EOS31834 AA436673 Hs.29417 Homo sapiens mRNA; cDNA DKFZp586B0323 (from clone DKFZp586B0323) 1.1
328249 EOS28180 c_6_hs gl|63B1891|ref] gn 2 - 96352 96527 ex 2 3 CDSi 6.19 1764550 

Ca06_hsgi]6381891 1.1 
15 338251 EOS38182 M22_6849FG_UNK.EMACX05500.GENSCAN.270-1 

CH22»EHftAC005500.GENSCANi70-1 1.1 

323561 EOS23492 AA825426 Hs^38832 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.1

301464 EOS01395 AA991519 Hs.253324 ESTs 1.1 
_ _ 335916 EOS35847 CH2^3293FG_636J2_UNK.EMAC00550aGENSCAN < 526-12 

20 CH22_FGENES.636_12 * 1.1 

321828 EOS21759 X56197 EST cluster (not in UniGene) 1.1

327413 EOS27344 cJLhs gi|5867750jref| gn 3 + 101410 101508 ex 4 5 CDSi 4.34 99 587 

CH.0L.hs gl|5887750 1.1 
334474 EOS34405 CH22_1773FG_394_5_UNieEMAC005500.GENSCAhL257-5 

25 CH22_FGENES.394_5 1.1 

336739 EOS36670 CH22_4291FGJ17J_ CH22_FGENES.117-3 1.1 

316517 EOS16448 AI784315 Hs.123163 ESTs 1.1 
325519 EOS25450 c12_hs gi|6017036|rel| gn 5 - 186804 166915 ex 1 3 COS! 8^6 112 2508 

„ cai2_hsgT|6017036 1.1 

30 333875 EOS33805 (TOJ145FG_291J1 JJNK_EM:AC00550aGENSCAN.14^6 

CH22_FGENES.291J1 1.1 
338221 EOS38152 CH2L6797FG__UNieEM^C005500.GENSCAN,246-10 

CH22.EMAC005500.GENSCAM246-10 1.1 

336878 EOS36809 CH22_4617FG_318X CH22.FGENES.31&5 1.1 
35 337919 EOS37850 CH22J338FG_UNK^EM:AC005500.GENSCAN.6W 

CH22_EMAC005500.GENSCAN.66-5 1.1 

309828 EOS09759 AW293999 EST singleton (not in UniGene) with exon hit 1.1

305259 EOS05190 AA679225 EST singleton (not in UniGene) with exon hit 1.1

333922 EOS33853 CH22_1195FG_2S6J3_UNK^EMAC0O5500.GENSCAN.155-16 

40 CH22_FGENES^96_13 1.1 

322092 EOS22023 AF085833 EST cluster (not in UniGene) 1.1

313356 EOS13287 AI266254 Hs.132929 ESTs 1.1

318847 EOS18778 Z42908 Hs.12308 ESTs 1.1 

337175 EOS37106 CH22_5195FG_567_1_ CK22_FGENES.567-1 1.1 

45 336979 EOS36910 CH22_4802FGJ85J_ CH22.FGENES.385-4 1.1 

312169 EOS12100 AI064824 Hs.193385 ESTs 1.1
336198 EOS36129 Oi22.3595FG_719JLUNK.DA59H18.GENSCAN.21-2 

CH22_FGENES.719_2 1.1 

321948 EOS21879 AA309612 Hs.118797 ubiquitin-conjugating enzyme E2D 3 (homologous to yeast UBC4/5) 1.1

50 324692 EOS24623 AA557952 EST cluster (not in UniGene) 1.1 

330395 EOS30326 D10923 Hs.137555 putative chemokine receptor GTP-binding protein 1.1 
333119 EOS33050 CH22_347FG_80_4_UNK_EM^C000097.GENSCAN.6W 

CH22.FGENES.80 4 1.1 

316012 EOS15943 AA764950 Hs.119898 ESTs 1.1

55 300142 EOS00073 AI743419 HsJ05707 ESTs 1.1 

317215 EOS17146 AW014242 Hs.159998 ESTs 1.1 
329526 EOS29457 c10_p2gi)3983506|gb]Ugn 2 +12251 12325 ex 3 3 CDSI 7.37 75 178 

CR10_p2gi|3983506 1.1 

317409 EOS17340 AA764968 Hs.4864 KIAA0892 protein 1.1
60 339230 EOS39161 CH22.8171FG_UNK.BA354H2.GENSCAN.1-6 

CH22_BA354i12.GENSCAN.1-6 1.1 

311598 EOS11529 AW023595 Hs^32048 ESTs 1.1 
339164 EOS39095 CH22_8091FG_UNK_DA59H18.GENSCAN.694 

CH22_DA59HiaGENSCAR6W 1.1 
65 326725 EOS26656 c20_hs gi|6552456 W gn 2 - 223005 223125 ex 5 6 CDSi 6.10 121 1038 

CH.20_hsgilS552456 1.1 

330952 EOS30883 H02855 Hs29567 ESTs 1.1 
334621 EOS34552 CH22_1928FG_412_4_UNK_B«AC005500.GENSCAN.27W 

CH22_FGENES.412_4 1.1 

301685 EOS01616 W67730 EST cluster (not in UniGene) with exon hit 1.1

308781 EOS08712 AI811707 EST singleton (not in UniGene) with exon hit 1.1 

323413 EOS23344 AA248828 Ha225676 ESTs 1.1 

306723 EOS06654 AI026151 EST singleton (not in UniGene) with exon hit 1.1

331256 EOS31189 Z41777 Hs.27413 ESTs 1.1 

75 313026 EOS12959 AS355433 Hs.190856 ESTs 1.1 
333002 EOS32933 CH22_226FG_59_3_UNK_EM^C000097.GENSCAN^1-3 

CH22_FGENES.59_3 1.1 

303011 EOS02942 AF090405 EST cluster (not in UniGene) with exon hit 1.1

317687 EOS17618 AA972990 Hs.127004 ESTs 1.1 
80 328779 EOS28710 c.7_hs gii5B6B309|ref}9n4+ 41570 41 639 ex 1 5 CDSf 2.65 70 5365 

CH07Jisgi|5868309 1.1 
338707 EOS38638 Oi22_7487FG_JJNK.EMAC00550aGENSCAN.482-2 

CH22_EM:ACO0550D.GENSCAN.482-2 1.1 
337974 EOS37905 CH22L6427FG_JJNK.EMAC005500.GENSCAN. 106-3 

85 CH22_EMACO05500.GENSCAN.1OM 1.1 

332854 EOS32785 CH22_71FG_22_1.UNK.C20H12GENSCAN.15-2 
CH22.FGENES32J 1.1

311225 EOS11156 AW451982 Hs^48813 ESTs 1.1

337094 EOS37025 CH22_5018FG_465_19_ CH22LFGENES.4e5.19 1.1 

319357 EOS19288 F13425 Hs2B229 ESTs 1.1 
332958 EOS328B9 CH22.182FG_48J5_UNK_EMAC000097.GENSCAN.2-11 

CH22.FGENES.48_1 5 1.1 

309634 EOS09565 AW193825 EST singleton (not in UniGene) with exon hit 1.1

321171 EOS21102 AI769410 Hs.221461 ESTs 1.1 

316440 EOS16371 AI954795 Hs.156135 ESTs 1.1 

311665 EOS11596 AW294254 Hs.223742 ESTs 1.1 
327548 EOS27479 c_3Jis gl[5867797|ren gn 2 - 81067 81130 ex 3 7 CDSi 6.42 64 12 

CH.03_hsgi|5867797 1.1 

314940 EOS14871 AW452768 Hs.162045 ESTs 1.1 
326401 EOS26332 c19Jis gi|5667355[refl gn 1 +35165 35332 ex 9 1 1 COS) 0.41 168 788 

CH.19_hsg!{5867355 1.1 
336347 EOS38278 CH22_3759FG_615J_UNICBA232E17.GENSCAN.1-24 

CH22_FGENEa815_3 1.1 

322297 EOS22228 W76548 Hs.136026 ESTs; Moderately similar to !!!! ALU SUBFAMILY SC WARNING ENTRY !!!! [H.sapiens] 1.1

309977 EOS09908 AW451663 EST singleton (not in UniGene) with exon hit 1.1

333466 EOS33397 OT22_717FG_161_2_UNKJWkAC005500.GENSCAN.42-2 

CH22_FGENES.161_2 1.1 
329170 EOS29101 cj_hs 5Q5868693JrefJ gn 2 + 67924 68019 ex 6 8 CDSi 3130 95 1882 

CRXJisgl|5868693 1.1 
329479 EOS29410 c10j>2 gi|3983526|gb|A gn 3 - 7425 7561 ex 1 3 CDSi 4.33 137 22 

Oit0_p2gif3983526 1.1 
326668 EOS26599 c20_hs gl|65524551refj gn 1 + 146726 146838 ex 1 1 1 1 CDSI 1.84 1 13 767 

CH.20_hs giI6552455 1.1 

319364 EOS19295 H06538 Hs. 12270 ESTs 1.1 

302988 EOS02919 W23986 Hs.34578 +to2frMfrB«taan 1.1

327687 EOS27618 c_4_hs gi|5867847|ref| gn 1 - 16933 169362 ex 2 3 CDSI -0.28 70 782 

CR04_hsgq58S7847 1.1 
339413 EOS39344 CH22_8405FG_UNKJXJ579N16.GENSCAN.^8 

CH22.DJ579NiaGENSCAN.W 1.1 

306156 EOS06087 AA918274 Hs.76067 heat shock 27kD protein 1 1.1 

320858 EOS20789 D59968 EST cluster (not in UniGene) 1.1

325447 EOS25378 c12_hsgi|5866941|ref|gn 3 - 372480 372621 ex 2 3 CDSi 9.16142 1026 

CK12_hsgIJ5B66941 1.1 

322696 EOS22627 AI084724 Hs.228468 ESTs 1.1 
329959 EOS29690 d 6_p2 gi|51 03803[g b|A gn 3 + 1 88050 1 881 93 ex 8 8 CDSi 2.0 1 144 361 

CH16_p2gi|5103803 1.1 

312628 EOS12559 AA632817 Hs.190316 ESTs 1.1 
339305 EOS39236 CH22_8262FG_UNK_BA354l12.GENSCAN.21-3 

CH22_BA354H2.GENSCAN.21-3 1.1 

311829 EOS11760 AI078483 Hs.134549 ESTs 1.1

303270 EOS03201 AL120518 Hs.105352 ESTs 1.1

321226 EOS21157 AA311443 Hs.251416 Homo sapiens mRNA; cDNA DKFZp586E2317 (from clone DKFZp586E2317) 1.1
335827 EOS35758 CH22_3200FG_620_1_UNK^EMAC005500.GENSCAN,512-1 

CH22_FGENES.620 1 1.1 

336677 EOS36608 CH22_4155FG_43_5_ CH22JGENES.435 1.1 
330081 EOS30012 c19_p2 gll6015314]gb)A gn 1 - 5783 5835 a 4 9 CDSi 288 68 162 

CH19_p2gf|60153l4 1.1 
339313 EOS39244 CH20272FG__UNK_BA354I1 2.GENSCAN.22-1 1 

CH22_BA354i12.GENSCAN.22-11 1.1

319936 EOS19867 W22152 EST cluster (not in UniGene) 1.1

332858 EOS32789 CH22_76FG_24_1 UNK_C20H12GENSCAN.16^ 

CH22_FGENES.24J 1.1 

315630 EOS15561 AA648355 Hs.185155 ESTs; Weakly similar to echinoderm microtubule-associated protein-like EMAP2 [H.sapiens] 1.1
332995 EOS32926 CH22_219FG_58_2_UNK_EMAOXIQ097.GENSCAN.19-2 

CH22_FGENES58_2 1.1 
333441 EOS33372 CH22_691FG_151_5_LINK_EM^C005500.GENSCAN.32^5 

CH22_FGENES.151_5 1.1 
333496 EOS33427 CH2*_748FG_1 68_6_UNK_EfAAC005500.GENSCAN.47-5 

CH2^FGENES.168_6 1.1 
339188 EOS39119 CH22_8123FG__UNK_OA59H18.GENSCAN.72-16 

CH22JJA59H18.GENSCAN.72-16 1.1 

336981 EOS36912 CH22_4818FG_397_7_ CH22.FGENES.397-7 1.1 

312142 EOS12073 AW298359 Hs.221069 ESTs 1.1 

315779 EOS15710 AW015736 Hs.211378 ESTs 1.1

316596 EOS16527 AI470235 Hs.172698 EST 1.1
335701 EOS35632 CH22_3062FG_599 1 UNK-EMAC005500. GENSCAN.49D-2 

CH22_FGENES.599J 1.1 

319395 EOS19326 AW062570 Hs.13809 ESTs 1.1 

304236 EOS04167 W93278 EST singleton (not in UniGene) with exon hit 1.1

307264 EOS07195 AI202211 EST singleton (not in UniGene) with exon hit 1.1

334066 EOS33997 CH22_1344FG_327J1.UNK^fcAC005500.GENSCAN.181-23 

CH22_FGENES.327_21 1.1 
327042 EOS26973 c21_hs gi]6531965|ret] gn 18 - 1380806 1381443 ex 1 5 CDSI 30.85 638 943 

CR2UisgiJ6531965 1.1 
326025 EOS25956 c17_hs gi|5867176N 9" 1 +70854 7091 5 ex 6 8 CDS! -1.46 62 127 

CH.17_hsgi|5B57176 1.1
325609 EOS25540 cUJis gi|5866996iref| gn 28 - 981751 981849 ex 1 10 CDS! 1.48 99 101 

<H14Jisgij5866996 1.1 

319983 EOS19914 T81429 EST cluster (not in UniGene) 1.1

334298 EOS34229 CH2^1589FG_37^_4_UNK_EMAC005500.GENSCAN.232^ 

CH22_FGEMES.372_4 1.1 

323203 EOS23134 AA203135 Hs.130186 ESTs 1.1
305700 EOS05531 AA815428 EST singleton (not in UniGene) with exon hit 1.1

313304 EOS13235 AI334078 Hs.152438 ESTs 1.1

310716 EOS10647 AI589618 Hs.192413 ESTs 1.1 
327049 EOS26980 c21_hs gi|6531965|reflgn 24- 1924026 1924110 ex 2 6 CDS 9.43 85 1012 

5 CH.21J1S gi|6531965 1.1 

313749 EOS13680 AW450376 Hs.130803 ESTs 1.1 

307041 EOS06972 AI144243 EST singleton (not in UniGene) with exon hit 1.1

322394 EOS22325 AF077208 EST cluster (not in UniGene) 1.1
, n 326416 EOS26347 c19Jis gl}5667362}refj gn 3 - 45283 45375 ex 3 3 CDSf 5.65 93 923 

1U CH.19_hsgi|5867382 1.1 
333947 EOS33878 CH22„1221F6_303_1_UNK„EMAC00550aGENSCAN.162-5 

CH22J r GENES.303 1 1.1 

324809 EOS24640 AW299534 EST cluster (not in UniGene) 1.1 
330057 EOS29988 c17_p2 Qf|64789S2|gbIA gn 3 75145 75287 ex 3 3 CDSI -Z56 143 1 50 

15 CH.17_p2giI6478962 1.1 
337603 EOS37534 C^5B96FG_UNK_OT1ZGENSCAN.16-2 

CH2i.C20H1ZGENSCAN.16-2 1.1 
332913 EOS32844 CH22_134FG_36JB_UN}CC20H1ZG£NSCAN^8-17 

CH22_FGENESJ6J8 1.1 

20 310026 EOS09957 T24895 Hs.100691 ESTs 1.1 
330153 EOS30084 c21_p2 g)14325335|gb|A gn 2 + 146951 147475 ex 2 2 COS) 25.45 525 233 

CH.21_p2gi|4325335 1.1 
334118 EOS34049 CH22J396FG^330J9_UNieEM:AC0055(KLGENSCAK.185-20 

CH22_FGENES.330 19 1.1 

324795 EOS24726 AI494481 Hs.141579 ESTs 1.1

332530 EOS32461 M31682 Hs.1735 inhibin; beta B (activin AB beta polypeptide) 1.1

332048 EOS31979 AA496019 Hs_i01591 ESTs 1.1 
334532 EOS34463 CH22_1834FG_40^13JJWK_ElVtAC005500.GEWSCAN.26S.13 

CH22_FGENES.40i.13 1.1 
30 329762 EOS29693 c14_p2 gi]6048280|emb| gn 3 + 127744 127878 ex 2 4 CDS1 1 1.66 1 35 1054 

CH.14j)2g]|6048280 1.1 
332909 EOS32840 CH22J30FGJ6J3JJNKJX»H1Z(EN$CAN.28-10 

CH22_FGENES.36J3 1.1 

321253 EOS21184 AT69S484 EST cluster (not in UniGene) 1.1
35 336572 EOS36503 O_^4007FG_843J2jJNK_DJ579N16.GENSCAN.15-13 

CH22_FGENES.B43.12 1.1 
328768 EOS28699 c_7_hs gl|601 7031 Ireflgn 5 -223741 224238 ex 1 1 CDSo 30.00 498 5285 

CR07Jisgi|6017031 1.1 
334335 EOS34266 CH22_1627FGJ75J2_UN)e-!V_AC0055QaGENSCAN.235.12 

40 CH22_FGENES.375_12 1.1 
334063 EOS33994 CH22_1341FG_327J7_UNieEI^AC0055QO.GENSCAN.181-20 

CH22 - FGENES.327_17 1.1 
333011 EOS32942 CH22_235FG_61XUNK..EWbACC00097.GE^SCAN.23-3 

CH22.FGENES.61_3 1.1 

304677 EOS04608 AA548071 EST singleton (not in UniGene) with exon hit 1.1

313948 EOS13879 AW452823 Hs.135268 ESTs 1.1
334358 EOS34289 CH22_1652FG_37BJ_UNK^EM^C005500.GENSCAN^39-1 

CH22_FGENES.378J 1.1 
328479 EOS28410 c_7 Js gj]5868449[ref| gn 1 -331 560 ex 1 31 CDS1 18.51 230 2100 

CH.07_hs gi|5868449 1.1
335813 EOS35744 C^31B5FGJ18J_UNK_EMAC<J05500.GENSCAN.510-1 

CH22_FGENES.618J 1.1 

312430 EOS12361 AW139117 Hs.117494 ESTs 1.1 

324783 EOS24714 AA540770 EST cluster (not in UniGene) 1.1
55 337776 EOS37707 OC2^6132FG_UNK_EI^C«J0097.GENSCAN.119-18 

CH223tAC00(MS7.GENSCAN.119-18 1.1 
327205 EOS27136 cj Jis gil5867447|re1j gn 5 + 1 67335 167576 ex 9 9 CDS1 15.50 242 2S) 

CH01Jisgil5867447 1.1 

315198 EOS15129 AI741506 Hs.186753 ESTs; Weakly similar to !!!! ALU SUBFAMILY J WARNING ENTRY !!!! [H.sapiens] 1.1
60 336135 EOS36066 CH22_3525FG_704J^L!^DA59H18.G_NSCAN.9^ 

CH22_FGENES.704_3 1.1 

318558 EOS18489 AW402677 Hs.90372 ESTs 1.1 
328152 EOS28063 c_6_hs gi|5868050Jre1] gn 1 - 73981 74203 ex 1 8 CDSI 31.69 223 341 1 

CH.06_hs gi|5B6806O 1.1 
65 330211 EOS30142 c_5_p2 gi|6013592|gbjA gn 1 * 59158 59215 ex 2 4 CDSi 4.20 58 184 

CH05_p2giI6013592 1.1 
339280 EOS39211 CH22_8234FG_UNK_BA354I12GENSCAN.14-12 

CH_2_BA354I1ZGENSCAN.14.12 1.1 

332045 EOS31976 AA491253 Hs.155045 bromodomain adjacent to zinc finger domain; 2A 1.1

70 313597 EOS13528 AW162263 Hs_>49990 ESTs 1.1 
329503 EOS29434 c10_p2 gl| 398351 7Jgb|U gn 2 - 1801 1937 ex 1 4 CDS! 4.33 137 101 

CR10_p2giJ3983517 1.1 
333488 EOS33419 (»^740FGJ67_3_UNK_EI^C005500.G_NSCAN.46.10 

CH22_FGENEai67 3 1.1 

75 311960 EOS11891 AW440133 Hs.189690 ESTs 1.1 

320590 EOS20521 U67058 Hs.168102 Human proteinase activated receptor-2 mRNA; 3UTR 1.1 
334047 EOS33978 CH22J325FGJ26_5_UNreEM^C005500.GENSCAN.175.5 

CH22_FG£NES.326_5 1.1 

304782 EOS04713 AA582081 EST singleton (not in UniGene) with exon hit 1.1

324231 EOS24162 W50827 EST cluster (not in UniGene) 1.1
327212 EOS27143 cj Jb gi|5867463|ref) gn 1 - 42308 42424 ax 5 13 CDSi 6.58 1 17 325 

CH.01Jsgf|5B67463 1.1 
335857 EOS35788 O122_3232FGJ29J_UNK_EM^C00550aGENSCAN.519-1 

CH-Z^FGENESiSfl 1 1.1 

85 317775 EOS17708 AA974603 Hs.181123 ESTs 1.1 

331053 EOS30984 N70242 Hs.183146 ESTs 1.1 
335940 EOS35871 CH2?_3318FG.646J3.UNK.DJ246D7.GENSCAN.1.12

CH22_FGENES.646 13 
322568 EOS22499 W87342 Hs.209652 ESTs
314091 EOS14022 AI253112 Hs.133540 ESTs
313570 EOS13501 AA041455 Hs.209312 ESTs
300967 EOS00898 AA565209 Hs.190216 ESTs
314544 EOS14475 AA399018 Hs.250835 ESTs

328321 EOS28252 cjjrs gl|5668373|ref] gn 7- 1029614 1029673 ex1 3 CDSI-2>J0 60 448 

CH.07_hs gi|5868373 
310979 EOS10910 AW445166 Hs.f70802 ESTs 
310730 EOS10661 A1939421 Hs. 160900 ESTs 
318471 EOS18402 AW137725 Hs.146874 ESTs 
315533 EOS15464 AW206191 Hs. 152774 ESTs 

325761 EOS25682 cUJis gi|66B2474|ref| gn 4*130437 130520 ex 6 7 COS] 022841666 

CH.14 hs #682474 
318780 EOS18711 R90906 Hs.113307 ESTs

313271 EOS13202 AW444819 Hs.144851 ESTs; Weakly similar to C09F5.2 [C.elegans]

304546 EOS04477 AA486074 EST singleton (not in UniGene) with exon hit 

330618 EOS30549 X55990 Hs.73839 ribonuclease; RNase A family; 3 (eosinophil cationic protein)

332931 EOS32862 CH2^152FG.38.5.UNlCC20H12.GENSCAN^5 

CH22_FGENES.38 5 
336602 EOS38533 CH2^_4047F(L37^4„UNrLEMAC(KI5500.GENSCAN.232-4 

CH22_FGENES.372J 
311185 EOS11116 A1638294 Hs.224665 ESTs 
337585 EOS37616 (}l22_5873FG__UNK_C20H12.G_NSCAN.5-3 

CH22_C2QH12.GENSCAN.5-3 

310249 EOS10180 AW071751 Hs.13179 ESTs; Moderately similar to !!!! ALU SUBFAMILY SQ WARNING ENTRY !!!! [H.sapiens]

314578 EOS14509 AA410183 Hs.137475 ESTs 

310750 EOS10681 AI373163 Hs.170333 ESTs 

333968 EOS33899 CH22.1245FG_307XUNK_EfAAC005500.GENSCAN.165^ 

CH22_FGENES.307_4 
316133 EOS16064 AI187742 Hs.125562 ESTs 

308337 EOS08268 AI808947 EST singleton (not in UniGene) with exon hit

326160 EOS26091 c17_hs gi|5867254|ref| gn 6- 112000 112137 ex 24 CDSi &Q1 138 1952 

CH.17_hsgi|5867254 
336023 EOS35954 CH22_3406FG_669J2_UNK.DJ32I10.GENSCAN.9-17 

CH22_FGENES669_12 
323479 EOS23410 AA278246 EST cluster (not in UniGene)

336090 EOS36021 CH22J477FG.6B9JLUNK_DJ32I10.GENSCAN.23.20 

CH22_FGENES.689_2 
311192 EOS11123 AW237220 HsJ11130 ESTs 
335081 EOS35012 Oi22J409FG_4B8_4_Ur^EKtAC005500.G_NSCAN.384^ 

CH22_FGENES.488_4 
309519 EOS09450 AW148940 Hs.246647 EST 
321172 EOS21103 H49160 Hs.133472 ESTs 

301976 EOS01907 T97905 EST cluster (not in UniGene) with exon hit

323012 EOS22943 AIB32201 Hs.211469 ESTs 
319528 EOS19459 R08673 Hs.177514 ESTs 

329838 EOS29769 c14_p2gi|6672062|emb|gn 2 + 33990 34098 ex 3 4 CDSi 9.11 109 2222 

CU14_p2gl|6672062 

302623 EOS02554 AB019571 EST cluster (not in UniGene) with exon hit

334433 EOS34364 CH_2_1731FG_3B5_8_UNK.EWAC005SM).GENSCAM249^ 

CH22_FGENES.385_8 

304747 EOS04678 AA577816 EST singleton (not in UniGene) with exon hit 

333270 EOS33201 CH22_513FGJ21_1_UNiC£VtAC005500.(SNSCAN.4.11 

CH2^FGENES.121_1 
307054 EOS06985 AI148181 Hs.176835 EST 
320764 EOS20695 R73070 He.246927 ESTs 

321523 EOS21454 H78472 Hs.191325 ESTs; Weakly similar to cDNA EST yk414c9.3 comes from this gene [C.elegans]
322114 EOS22045 AA543791 Hs.191740 ESTs 

303582 EOS03513 AA377444 EST cluster (not in UniGene) with exon hit 

322924 EOS22855 AA669253 Hs.193971 ESTs 

311179 EOS11110 AI880843 Hs.223333 ESTs 

318601 EOS18532 T39921 EST cluster (not in UniGene)

309791 EOS09722 AW276176 Hs.73742 ribosomal protein; large; P0 

333882 EOS33813 Gi22_1 153FG_292_4^UNK_EKtAC005S)aG_NSCAN.15a4 

CH22_FGENES.292_4 
337645 EOS37576 O122L5960FG_UNK_EM:AC0DQ097.GENSCAN.104 

CH22 EMAC000097.GENSCAN.10-8 
335623 EOS35554 CH2^_2983FG_5B4_Z.UNK_EMAC005500.GENSCAN.478-2 

CH22_FGENES.584_2 
314745 EOS14676 AA564489 Hs. 137526 ESTs 
330790 EOS30721 T48536 - Hs.105807 ESTs 
332071 EOS32002 AA598594 Hs.112475 ESTs 
312005 EOS11936 T78450 Hs.13941 ESTs 

330694 EOS30625 AA019806 Hs.108447 spinocerebellar ataxia 7 (olivopontocerebellar a^
330739 EOS30670 AA293477 Hs.227591 ESTs 

303042 EOS02973 AF129532 EST cluster (not in UniGene) with exon hit

323091 EOS23022 AW014094 Hs.210761 ESTs 

326B20 EOS28751 c_7Jis g!|5B68330|refl gn 1 +90446 90602 ex 3 4 CDSi 10i20 167 5634 

CH.07 hsgl]5868330 
300472 EOS00403 T90622 Hs.82609 hydroxymethylbilane synthase
310645 EOS10576 AI420742 Hs.163502 ESTs 
332238 EOS32169 N53480 Hs.108622 ESTs 
300966 EOS00897 AA564740 Hs.25B401 ESTs 

330437 EOS30368 HG2730-HT2827 Fibrinogen, A Alpha Polypeptide, All Splice 2. E 
302292 EOS02223 AF067797 EST cluster (not in UniGene) with exon hit 1.1

330138 EOS30069 c21 _p2gi[4210430!emb|gn 1-22334 22460 ex 3 3 COSf 16,56127105 

CH.21_p2 #210430 1.1 
_ 332952 EOS328B3 CH22_176FG_46_8_UNK_EfAAC00Q097.GENSCAN.2-4 

5 CH22.FGENES.48J 1.1 

319901 EOS19832 T77136 Hs.8765 RNA helicase-related protein 1.1

321166 EOS21097 AA411263 Hs. 128783 ESTs 1.1 
336227 EOS36158 CH22_3625FG_730_2_UNK.DA59HiaGENSCAN.36-2 

CH22_FGENES.730_2 1.1 

302332 EOS02263 AI833168 Hs.184507 Homo sapiens Chromosome 16 BAC clone CIT987SK-4-328A3 1.1

313800 EOS13731 AW296132 Hs.166674 ESTs 1.1 
339356 EOS39287 CH22_8326FG__UNK.BA354I12.GENSCAN.31-1 

CH22_BA354I12.GENSCAN.31-1 1.1 

324512 EOS24443 AW502125 EST cluster (not in UniGene) 1.1

15 319235 EOS19166 F11330 Hs.177633 ESTs 1.1 

320352 EOS20283 Y13323 Hs.145296 disintegrin protease 1.1
338316 EOS38247 CH22_6944FG_UNK„EfltAC005500.GENSCAN.304-2 

CH22.EMAC005500.GENSCAN.304-2 1.1 
333964 EOS33895 CH22_1 241 FG_305_2_UNK_EMACC05500.GENSCAN.1 64-2 

20 CH22_FGENESJ05.2 1.1 

312758 EOS12689 AA721107 Hs.202604 ESTs 1.1
336178 EOS38109 C^6726FG_UNICEfAAC005500.GENSCAN.21W 

CH22_EMAC00550aGENSCAN.21M 1.1 

_ _ 315199 EOS15130 AA877996 Hs.125376 ESTs 1.1 

25 312321 EOS12252 R66210 Hs.166937 ESTs 1.1 
338765 EOS38696 CH22L7588FG__UNK_EMAC005500.GENSCAR516-1 

CH22_EMAC005500.GENSCAN.518-1 1.1 

330547 EOS30478 U32989 Hs.183671 tryptophan 2;3-dioxygenase 1.1

315368 EOS15299 AW291563 Hs.152495 ESTs 1.1 
30 328691 EOS28622 c_7_hs gi|6588001|ref] gn 7 - 579598 579654 ex 2 3 COS) 12.78 67 4326 

CR07jtsgi|6588001 1.1 
329179 EOS29110 c_x_hs gl|5868704jrefl gn 2 +181639 181815 ex 3 4 CDS! 0.32177 1939 

CRXJis gl5868704 1.1 
327072 EOS27003 c21 Js gi|6531965|refj gn 55 - 3796429 3797197 ex 4 4 COSf 9.33 769 1270 

35 CR21_hsnj|6531965 1.1 

312056 EOS11987 T83748 Hs.189712 ESTs 1.1 
339128 EOS39059 CH22_8046FG_UNK_DA59H18.GENSCAN.55-2 

CH22_DA59HiaGENSCAN.55-2 1.1 

307646 EOS07577 AI302236 EST singleton (not in UniGene) with exon hit 1.1

319198 EOS19129 F07354 EST cluster (not in UniGene) 1.1

338556 EOS38487 CH22_7283FG_UNK_EfAAC005500.GENSCAN.417-8 

CH22_EMtAC005500.GENSCAN.417-B 1.1 

306143 EOS06074 AA916314 EST singleton (not in UniGene) with exon hit 1.1

332384 EOS32315 M11433 Hs.101850 retinol-binding protein 1; cellular 1.1

45 325100 EOS25031 T10265 Hs.116122 EST* Weakly sirnilartocoto for ty^ 1.1 

309839 EOS09770 AW296076 EST singleton (not in UniGene) with exon hit 1.1 

312180 EOS12111 AI248285 Hs.118348 ESTs 1.1 

330385 EOS30316 AA449749 Hs.31386 ESTs; Highly similar to secreted apoptosis related protein 1 [H.sapiens] 1.1

315882 EOS15813 AI831297 Hs.123310 ESTs 1.1 
50 325843 EOS25774 c16Jisg]|6552453)ref]gn 1-71 26 7232 ex 13 CDSI 1.87107 182 

CR16 hsgj|6552453 1.1 

330763 EOS30714 060050 Hs.34812 ESTs ~ 1.1 

317224 EOS17155 D56760 Hs£122 ESTs 1.1 

316042 EOS15973 AW297979 Hs. 170698 ESTs 1.1 
55 333524 EOS33455 C^22_781FG_175_10_UNK_EM^C005500.GENSCAN.53-15 

CH22_FGENES.175_10 1.1 

302357 EOS02288 X03178 Hs.198246 group-specific component (vitamin D binding protein) 1.1 

309830 EOS09761 AW294725 EST singleton (not in UniGene) with exon hit 1.1 

321489 EOS21420 AW392474 Hs.172759 ESTs; Moderately similar to !!!! ALU SUBFAMILY SQ WARNING ENTRY !!!! [H.sapiens] 1.1

60 312304 EOS12235 AA491949 Hs.183359 ESTs 1.1 

322026 EOS21957 AA233527 Hs.213289 low density lipoprotein receptor (familial hypercholesterolemia) 1.1
TABLE 1A

Table 1A shows the accession numbers for those primekeys in Table 1 which lack a
UniGene ID. Listed for each probeset is the gene cluster (CAT) number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity
using Clustering and Alignment Tools (DoubleTwist, Oakland, California). The Genbank
accession numbers for sequences comprising each cluster are listed in the "Accession"
column.






Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers
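
As an illustration only, a row of Table 1A can be read as a probeset key, a cluster number, and a list of accessions. The Python sketch below assumes whitespace-separated fields in that order; the names `ClusterRow` and `parse_table_1a_row` are hypothetical and are not part of the original disclosure. The two sample rows are taken from the table as printed.

```python
from typing import Dict, List, NamedTuple


class ClusterRow(NamedTuple):
    """One Table 1A row (hypothetical container, for illustration)."""
    pkey: int              # Eos probeset identifier
    cat_number: str        # gene cluster (CAT) number, kept as text
    accessions: List[str]  # Genbank accessions making up the cluster


def parse_table_1a_row(line: str) -> ClusterRow:
    """Split a row into Pkey, CAT number, and accession list.

    Assumes the fields are whitespace-separated and appear in that order.
    """
    fields = line.split()
    if len(fields) < 3:
        raise ValueError("expected Pkey, CAT number, and at least one accession")
    return ClusterRow(int(fields[0]), fields[1], fields[2:])


# Two rows copied from Table 1A.
rows = [
    "322340 47509.1 AF088076 W95222 W92523",
    "323166 162676.1 AA291001 AA188974 AA290616",
]

# Index the rows by Pkey so a probeset can be looked up directly.
index: Dict[int, ClusterRow] = {}
for text in rows:
    row = parse_table_1a_row(text)
    index[row.pkey] = row
```

Keeping the CAT number as text rather than a float preserves the suffix after the period exactly as printed.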



Pkey CAT number 



Accession 



20 300611 
301187 
301254 



301454 
25 301615 
301661 



301685 
301804 



301976 
302245 



302292 
302476 



302623 
45 302626 




65 302655 
302758 



337193J 

434061J 

463589J 

468223J 

534162J 

5613J 

7974J 



326972.1 
61_1 



128835.1 
M2J 



27735.1 
31932L3 

9705.1 
10441J 



24028 3 
458.60 



N75450 AA877636 AW137945 W05248 AA514763 AW972399 Af758397 AW195051 

AW976692 AA806542 AA745856 

AI049624 AA814705 AW404856 BE078289 BE078292 

AA829774A1082020 

AI75 1 738 AA977930 

W39477 AKD02047 NNL015515 T58707 AA386214 C19007 AA295466 T49621 T47323 

AKD01735 AF227906 AI815558 AW238991 AL133051 AW272417 AI083492 AI616503 AW888717 AA333166 AI925832 
BE048352 BE048415 AI141922 AW805674 AW805578 AA633581 AA424632 R71439 AW020988 AW976735 AA883247 
W37208 AI091039 AW317020 BE221788 AA502917 AW009024 AI141417 BE349081 A1421443 A1080490 AHJ03921 AI373690 
AI379240 AA424587 AA74C607 AA972391 AA620797 AW271656 AA400517 AI370902 AI680616 AA757270 AA909500 
N32107 R43738 AI270464 AIB70568 AI085139 AA225666 241046 AI767739 AI270546 N56779 
W67730 Z44630 AA490699 W67596 W76661 R21207 

AK001468 AA190315 AA374980 AW961 179 AA307782 AA315295 AA347194 AW953073 AW368190 AW3681 92 AA280772 
AA251247 N85676 A1215522 AI216389 N87835 R12261 R57094AK60045 AA347193R16712AW119006N559O5N87768 
AW900167 AI341261 AI818674 D20285 AI475165 AA300756 R40626 AI122827 AA133250 AI952488 AA970372 AA889845 
AW069517 AI524385 AA190314 A1673359 AA971105 AI351088 AI872789 AI919056 A1611216 AKD01472 BE568761 
AA581004 
T97905AA101672 

H18835 R47363 AI460004 N31660 M454774 AA551759 AJ417040 AA694490 AA633315 AI344661 M708532 AA87B567 
A1802702 AI913465 AW001160 AA932133 A1092908 AA026974 AW628573 AA592910 H18836 AI274428 C00675 AKDD0048 
BE313619 

AF067797 AB013456 NNL001 169 AI791955 AW843925 AI732659 AA577625 AW083143 AW138645 

AF182294 NM.016200 AL046942 AI354410 AI697029 AI859557 AW188855 AW105437 AI358735 AW000959 A1491813 

AW023693 

AW836724 BE243668 AB019571 H43803 

AK001553 AK001951 AB021870 NMJJ16282 F01168 AA211870 AA0788B9 AA312979 AL138385 R70844 AA16565B 
AA007Z79 AA194688 H65871 AA476639 F01095 AA300170 R39487 AA649126 AA193643 AA41 8300 BE173477 N84408 
AW024465 AA406255 BE173412 BE173583 BE173470 AWD69288 AA372937 BE504414 AA209472 AI262833 AI628359 
AI458075 AM76266 AA397706 AI768605 AW243125 AI056436 AA838326 AA810651 AI472025 N35912 AA165622 AI985532 
AI139528 AA626087 W1699B A1632833 AW130827 AW662551 AA731459 AW780188 AI653447 AI694970 AA810662 
AI199987 A1587402 A1492972 H65872 AI805624 AW194835 AW994874 R70790 AA836506 N53285 F00181 R83595 
AI290941 AW936750 AW936703 AW936623 AW936785 AW936691 AW936668 AW936713 AW936788 AW936744 
AW936613 AW936614 AW936665 AW936702 AW936647 AW936643 AW936712 AW936791 AW936624 AW936672 
AW936754 AW936596 AW936802 AW936792 AW936589 AW936692 AW936645 AW936746 AW936801 AW936748 
AW936661 AW936612 AW936697 AW936704 AW936695 AW936626 AW936794 AW936629 AW936577 AW936798 T35617 
AA375943 R29459 AW936717 AA342108 AW963351 Z24876 AW936708 AW3741 10 AW936586 W20080 AW936752 W31803 
AA093709 AA431256 AW803610 AA424959 W76607 AA432267 W72009 R70817 AW778851 AA890563 AA194632 A1089844 
AI373864 AA890333 A1745574 AI095714 AI567507 AI280712 AW864083 AW468991 N48087 AA860500 AA279471 
AA993680 AA676504 AI360949 AI052134 AI038657 AI439836 AA629147 AA651840 AA435925 AA854457 AW796472 
AA838729 AA193407 AA302403 AW958003 AA342107 AA639258 AI43581 1 AA410342 N25790 AA156454 AI539628 
AI275854 N58849 A1858171 AW338576 W15321 AA418342 AA780577 W04701 AA630452 AW769154 AI274286 N23736 
BE465020 A1554346 AI920804 AA969728 AW193440 AI368697 AA115096 AA564981 AA630461 N91475 BE464381 
AA913741 AA757161 Z24907 C00067 AA649290 AI245223 AA363098 AI520754 AA887983 AI273015 AW878871 AW878981 
AA480455 AA709267 AW959521 AW959523 N90014 N32441 F00193 AA115095 AA147583 W19813 AI333349 A1197937 
R39488AW750110 

AJ227892 AA338715 BE074475 BE074469 BE074474 AW00B182 AW572953 AI831725 AI762923 AI341466 AW449335 

BE551685 AI692895 AI040410 A1276881 AI89100B 

AKD01841 H40087H11121 AW408676 N99603 AA984553 H92041 H11226 

AW403330ARJ62097 
302977 47403.3 



302980 47495J 
303011 41689J 
HJ 



303042 5058.1 



303072 4654J 



303149 
303306 



303443 
303502 
303582 
303610 
303642 
303777 



303874 



97393.1 
11887.1 



224022J 

325188.1 

647662.1 

226089J 

284260.1 

244977.1 

1770217.1 

5013J 



45 303886 81595.1 



304084 
304143 
304165 
304183 
304236 
304350 
304439 
304495 
304518 
304521 
304546 



.-14 



14011.1 



304547 
304636 

75 304677 2822.15 



AW263124 AI925166 AW105732 AA804479 BE621436 AF086399 W79085 W74440 AW992181 AA389686 AA31431 1 

M173955 AA677564 D59895 D60771 AI887733 C14814 AW162193 D81B94 AA732538 AW150919 M748064 AA769465 

AA708143 BE327613 AA092726 AI692476 T35673 233600 AA134036 A1671394 A1267461 AW362795 AI769759 AA909042 

AA130042 AW156938 AI753129 A1246205 AI823883 AI752B36 D60770 AI336386 AI584003 AW627976 AI348676 D59894 

AJ969795AW073259AW00534AJ081318A1082427BE550515 

AI925740 AF086489 W93435 W93345 AA337166 AW966214 AA336257 T11355 AW842435 

AF090405 AF090407 AF090406 

AF1 18395 NWL014317 AW376657 AW848189 A1261617 AI963829 AWB48591 AW848598 AW376696 AW848523 AW848450 
AW848655 AW848183 AW848550 AW376675 AI632752 AI590245 AW31824 A1857990 AI953341 AA888092 AW364968 
AI188545 A1217741 AW275905 AI311481 AI991404 AI364963AA62B392AA927982 AW150563 AA503063 AW079470 
AW512180 AA889371 AW390132 AW609052 AW3901 1 2 AW581780 

AW505345 AF129532 AF126028 AA852108 BE169359 R83701 243904 BE613543 AA283163 AA905463 AW067849 R13544 
R12337 R14020 H98970 AW74918 N56139 AL1 35669 AW067702 AW372065 AW631389 AA083416 AA28751 1 AA602923 
AA488914 AI167215 AW946829 R82855 AI948792 AA371333 AW953883 AW956152 C02539 AA298280 A1932587 AA022742 
A1983021 AA195252 N58991 R78733 AW083996 H39614 AI365249 AW615389 A1927744 AI089971 N52205 AA083417 
BE326666 BE349514 AT743785 AI640148 AI37821 1 AW181A81 AI949484 W31374 AW628233 AA418406 AW068010 
AA325579 AW961004 BE004785 T73234 AA209403 N54886
W22152 AV647377 AV647331 AA320693 T79025 F23202 
T81429T95572 T95563 
AA873350TB2429T82428 

R26830 R33029 AA115761 AW1 18148 AI743741 AA954284 AI934165 AKJ88310 A1123759 AW340232 AI089180 AA700861 
AI129973 AI088552 AW963119 AA359516 R33916 
AI817336 R32883 AA595590 AI743065 R31386 

NM.005897 AF1 56857 AA346876 BE545147 AI003306 N45644 AW889728 BE007236 

AB030034 BE304778 NM.016653 AF251441 AA307684 AW957882 AF238255 NM.016595 R57782 AA244505 AA864846 
AW601475 AA232750 AI417539 AA232253 AW294490 AA626441 AW814670 AW814669 N95341 AM23874 M100075 
AW337275 AW804295 AI922069 BE161875 AM70677 AC42841 AA402558 AI435815 AA402496 AI359093 AA505991 
AW197200 AA234622 AA258509 H17033 AI799498 A1263346 AA236466 AA258354 N24807 R14272 AA100160 
AW003360 AI971548 AA017585 X80306 X91 133 AJ276100 X91 132 AF064499 AF064495 AF063768 X83713 AW951310 
AW975565 AA721610 AA715972 

AW975814 AA282765 AA811755 AA731129 BE219297 AA128302 R71285 AA095218 

R78712AA603646R78713 

W16480AA376361 N83837 R81853AA361779 

NWL004751 AF102542AW360893AF038650BE304708 AW360892AW360931 AWB42622AA307800BE292814 AW582119 

AW582122 AW374998 AW374874 AI587061 AA583339 AW662377 AW192901 AW887756 AW887761 AI955582 AI150400 

AA568218 AA583146 AJ832775 AA294858 AI445680 

D59968 D81035 C15620 D80887 D81432 C15618 D60320 D80661 

D62269AW022615 

AA737314 AA682280 AA010792 AA1 43573 AA953433 AA745273 AA188649 AA01 1221 
AA610649A1699484H59558 

BE245833 BE539992 AI380940 AW952644 AA535470 R84610 

AI696519 BE464779 AW298343 BE550149 AW470402 M129660 T78937 AA342648 N71662 H82431 A1302712 AV660681 

R85409 AA962323 AI680732 AA889147 M932629 AW103527 

AA078493 

AJ 2279 00 AI094933 AW0511 19 F00947 

AI874383 AI865710 A1201451 A1659387 U25919 BE093109 AW366305 BE141926 BE141913 AW854334 AW854342 
BE141916 

D59886 AA779752 AB55936 AW976526 AA235034 AA744353 T26888 AA235103 R96569 AA057301 AA057286 D61635 

H87227 

X13075X13076 

R39382 BE467537 AB57156 AI375103 AW021 134 AW79241 BE326541 AW150836 AI684065 R35463 AA678409 AI694321 
AW470057 AW608873 N62359 AI702778 AI701838 A1655208 BE465196 R51845 R38307 AW393336 AW043913 AA782285 
AJ205974L13824 L23311 AI635429 L13826 

AW812795 AA419617 H67827 AW299775 AW382168 AW382133 BE171659 AW392392 BE171641 AA541393 

R59890 R60548 N64863 AI224545 N691 14 AI81 1204 AW51 8902 A11 84866 AI4401 69 AA809472 H63089 AW952971 

AW337382 AI872923 N73882 AA334161 AI537113H63731 A3383952 N41701 

N78520 AW606984 AI287235 AA973956 N49122 

AW089866N63915 

N76794 N94221 W04156 AW897535 

AL137517 BE072492 AI127076 AW196207 AW2949^ 

AF085833 R69689 AW341677 AA923375 BE327566 AW630415 R69601 AW615339 

AF075083H52291H52528 

AF085975 H53458 H53459 

AI357412 AI870708 AI590539 W07459 

W76622AF086372 W72660 
322518 



38914J 
39354.1 



322610 21773.1 



322694 
20 322735 
322817 
322890 
322933 
322944 
25 322959 
322968 



322331 47467 J AF086467 W81444 W81445 
322340 47509.1 AF088076 W95222 W92523 

322394 2749^1 AW068287 AA310079 BE336702 AA356318 AA306059 AA346785 AW402633 AA311210 AW402909 N76879 AW402913 
AW401920 AA321636 M354474 C17297 C16938 AA311774 M29871 NM_002872Z82188 AW405674 H94176 R89281 
AA214723 AI014482 AW949347 T27749 AW804226 AW796964 AW404581 AF077208 NM_014029 W68830 W79B52 
AA353375 AW575218 AA552192 AA521232 AA702695 AA033975 AW407827 AA829948 N94402 AW628604 AI523308 
N57605 AA641662 H42477 N52784 AI753478 AA768493 AA845729 W47391 N55270 AI0901 17 RB9282 BE206172 
AA076650 AA595650AI21B931 BE049397A1433110 W741 14 H94277 AI358627 AI085221 AIB62818 AA835967 AW103905 
AI640644 AA835507 AA856887 AA694392 AW337542 AI52441 0 BE045500 AI440060 AI358601 AWQ28238 AW205248 
A171 8264 R48618 AA357358 AI595002 AA697549 AW081065 A1433360 AI810783 A1620963 Z82188 AA360224 
AI133446 T50819 AF147343 T50665 

AF155108 AW877241 AW393512 BE160738 AW384889 AW610272 BE160915 BE160774 BE160744 AW836696 AW384919 
AW836739 BE160743 BE160814 AW610275 BE160965 AW580785 AA6B2739 BE 160941 AW821 136 AW592083 AA449860 
AI798661 AA310698 AW302768 AW268932 AW268741 AI250559 AW302879 AW821 181 AW580243 AW384922 AW606812 
AI345641AWB21143AW384B73 

BE242847 AA159840 NM.016216 AF180919 8E262663 BE312610 W53026 BE093965 BE004620 AW992549 M069408 
R66803 BE002445 T80130 N67797 AA765401 AA765829 AW837997 AW837993 AW83801 1 AW838012 AW837996 
AA069435 W521 18 A1457469 AA954977 R39354 
35627.1 NM.014125 AF090919 AF075371 AH 10872 BE070571 

91819.1 AA086123 AA026296 AA086041 AA026297 
449438.1 AA777274 AI761381 A1738617 C02420 
117967J AA649792AA640427 
12735^_1 AA099759AA100511AA687172 
130324.1 AA1 12573 AA1 1 2574 M984323 
132817.1 AI267606 AA121045 AA126521 

17218.2 A1272141 A1879676AF070669 W25179AA534016 AA533386 AA010740 AF124147 W1 6493 W56636 AA25891 1 AA321677 
H44503 AA642777 AA081800 W69885 H82507 AA536128 AA326782 AA326783 AA353693 AA354642 R7331 1 AA354400 
W79820 W16502 AA301647 AI202303 AA453926 AA705795 AA01 1 1 28 AA929033 AI393389 AA845133 A1445640 AI677727 
AJ8182S6 AI369820 AI539292 AI870541 W69797 AI871096 BE550803 N35853 AA644019 N27809 R49769 M738197 
AI565700 AW207656 AA587216 AA669237 AI906947 A1809956 A1740905 AW04381 1 AW162476 AA659844 AI742797 
AI8321 03 AI660967 R86125 AIS74667 AI808074 AI869284 A1336214 AI218002 A1338629 A1857930 AW183986 BE044333 
AW135467 AI826077 AI357643 AM75486 AA478855 AW172550 AI553942 AA868731 AW268850 AI123793 AI887022 
AA046935 AI361954 AI091737 A1682235 A1367076 AI088882 A1808682 AI312679 AA046955 AW027546 AI660019 AI698174 
AW008626 AI256337 AI568959 AW027409 AI040014 AW134559 AA479953 AA910082 A1301458 AW028352 AI017863 

35 A1268915 AI185866 AI265907 AI274195 AW051540 AW027515 AI380435 AA883117 A1279396 AA846628 AW628235 

AW206201 AW628510 AA954276 A1301405 AI827185 AI553978 AI200301 AI470343 AA933953 AI914937 AI362849 
AWOB5066 AI204021 AA631192 AI351701 AA748663AA993806AA580146 AW027744 AA580016 AA897344 AI042638 
AI473196 AA995065AW027720 AI217421 AI935604 AW449411 AW237094 AI653348 NWLOO3107X7O683 AM70473 
AI765137 AI193479 AJ253050 AI470510 AI399828 AI371461 AI185518 N20940 R49816 N79977 W56599 N24649 W78113 

40 N78761 AI817673 AI911482 AW205984 AI240186 AI828016 A1942449 X65661 AW751587 AI392808 AI624192 AI950969 

AA573260 A1203361 AW79942 BE041834 AW305351 AI918327 BE048713 AW071712 BE041565 AI139260 BE466360 
BE502737 AW007819 AW071887 AJ742130A1344020 AW772112 A1932275 AI992189 AI197801 BE219990 A1990863 
AI536934 A1336275 AI971 955 AI798204 AI870429 AI652390 AI080187 BE219486 AI185434 AW628564 AW072399 A1656370 
AI496606 BE041559 AI743591 AW515805 AI087833 AI917506 A1123191 AI858043 A1334046 AI242585 AI636670 AI919478 

45 AW771467 AW17185 AM68527 AW137861 AI554782 AI130733 AW005164 AI910551 AM89135 AI963934 AI985482 AI660396 

A1497963 AW204662 AW137602 AI382505 AI493485 AI185987 A1078841 AI830054 AI378223 AI351299 AA937301 
AA242817 AA258359 AWQ27603 A1935204 A1500360 AI569741 BE551058 AW275536 AI457654 AI142093 AWQ28288 
AI286002 AI2791 14 AI384121 A1341323 AI190436 AW002607 AJ242488 AJ338 122 AI368600 AB4Q276 AM17994 AI190234 
AI275527 AI934886 AI498274 AI813630 AKJ75339 A1087976 AI459251 A1989477 AW004046 AI9921 90 AIB85279 AI479475 

50 AI698030 AI473294 A1951648 AI699587 AI660602 AI873018 AW613987 A1808297 AW270159 AW572955 AW195908 

AW469Q34 AW197100 AA885164 AW611668 AI143038 AI910560 AA418374 AW341092 AI671169 AI937136 AI204003 
AA775707 AW590759 AW593350 AW572981 AI197905 A1660941 A1743469 AW237017 AI808587 AI984962 AA418254 
AI828104 AA625231 A1832151 H84232 A1240215 AI91 1775 AI219668 A1336801 AA232630 AI343471 W69129 N93602 
AA768883 W04386 A1086277 AA983433 W07646 AA458584 N66625 AI384055 AI928089 W25479 AA242952 AI763303 

55 AI225039 AI740896 AA953758 W69240 AA558331 AI760593 AA558712 AW992121 AW992157 W69115 BE328596 AI953190 

W95311 AI950195 AI739605 A1857282 W69185 M884586 AI1 98104 A1127451 AA905932 AA723310 AI936623 AA732940 
AI332918 A1221396 AI336095 A1200067 AI824853 D55893 D52697 D56205 AA232764 T53299 H84555 AA076539 AA15B347 
BE298430 AI134493 AW732398 AW750740 AW578208 N36572 AA453861 AA252914 AA234197 AW576988 AW577034 
AA025199 AW577052 AW385538 AW576996 AW577021 T83230 AA421529 A1918492 AA909038 AA507060 AA654561 

60 AA064597 AW001594 AW469192 AI368002 AI142435 AW379382 W93438 AA076387 AI802344 A1097013 AA987215 

AA635282 W93349 AKJ17818 AA421564 AA156348 AI140004 AA506259 AW473184 AA236350 A1138669 H96873 AA974889 
AA643735 AA995463 AA995471 AA809555 AA253225 AI298682 A1572515 T53300 AA064596 AA193589 AA025 1 1 8 
A1669682 AA610638 T90774 AI972332 A1280776 T27980 AW136058 BE000428 A1378691 AA961520 BE049142 AI31 1424 
AA28321 1 AI344071 AI344007 AI344097 AI582410 AL036314 AW798036 AI905226 C15325 AA360386 AW958417 

65 AW630531 BE538239 T70488AA088296T34175 T31626 D54331 D53142AA029415AW946823AJ914128AA355446 

T34322 BEO06559 M85677 AA034335 T31463 AW804007 AA256591 D551 28 A1535884 D55192 N23605 T31802 AA326899 
AW999156 AA355201 AW999306 AI091590 BE172021 AA029490 BE000255 AW339939 AW150093 A1872098 AI274876 
T06303 AA857909 N23606 AA922714 AI914104 AI285281 AW999919 AI339803 AI081354 AA972184 AI049566 AW151583 
AI682455 AA088257 A1217050 BE551774 AI277033 AI252627 AA910406 AI369422 H46634 AI873113 AA033710 AW078579 

70 AI636452 N23010 AA357263 AA256592 T057B6 AA884195 AA406145 AA907807 AA482840 AI637691 AA654523 AA911495 

T06601 AW594370 AW016524 C15324 AA622519 AA340191 AA174168 C13992 T69433 796576 AW166622 T96575 
323011 139750 J AA5802B8 AA315655 AA133031 AA377748 
323166 162676.1 AA291001 AA188974 AA290616 

323216 6526.1 AA332145 AA331790 AW962563 AA868189 L13837 734468 AA055882 AA096148 AA092327 H57062 R59098 R1 1247 
75 F07659 Z44949 AF131829 L13835 T79889 AA252451 N2B984 H85260 AL046384 AW995631 R58386 AI061651 AW376050 

AW379789 W90347 AA450157 AI799939 AA461340 W02347 AA233095 N39675 AA659441 AW995284 W17060 R32252 






WO 02/21996 



PCT/US01/28716 



323243 140566J 



323244 
323333 
323430 



323465 
323479 
323538 
323032 
323731 
323753 
323835 
323838 
324048 
324231 
324430 
324432 

324456 
324512 
324575 
324609 
324620 



324670 
324692 
324715 

324728 
324783 
324848 
324961 



325071 
325176 



647858.1 

62251J 

63341J 



193343.1 

194627.1 

217887.1 

333100.1 

226193.1 

1246*_4 

506747.1 

243407.1 

267284.1 

975669.1 

312113.1 

312487J 

1155396.1 
1156071 J - 
65704.1 
333046J 



560496.1 
72231.1 
351987.1 
290035J 

210991.2 
389615.1 
371388.1 
376239.1 
22162L1 



1562044.1 
700767.1 



AI042599 AL046385 AI970370 AA744764 A1249761 A1628106 R32668 AIB63011 AI923998 AI186798 N26601 AH 41864 
N34992 AI377031 N23934 AI683466 BE219548 AA622032 AW089867 AA243717 N79547 R59099 AW241293 AI917545 
AW103697 AI383179 AW517527 AW193642 W90348 AW381409 R11195 AA461 166 AA836624 AA28Q285 AW242055 
L13836 N89647 L13834 AI358605 AA452023 AI868391 H57063 AW075868 N20590 Z40695 R37603 R28484 AA251913 
F03914AA055772 N43752 

W47525 AA134047 BE391212 AA330333 AA376355 BE304871 BE167342 H87402AA631722 W45724 AA715517 AI925438 

AI804849 AW241617 AVWQ3807 AI653435 AA134048 AW747874 AI922327 AI814967 AI935895 AA228865 AW504076 

AA225008AW673858C03914 

AW675572 AI248270 T85161 AL133848 T70731 T69747 

AV651680 AA228883 AA367341 AW962458 AA628024 AW172426 AI767785 AA313012 AW963323 

AW062479 AW062488 AW062491 AW062480 AW938564 AW062478 AA322408 AA324351 AW938595 AW938598 BE1623B9 

AW1 76556 AW938599 AW838792 AW938566 BE 162305 BE162377 AW938570 AW062459 AW176555 AW938562 

AW938568 AA251701 BE162320 AW938597 

AA267406 AA261844 AA261845 AA287355 AA810895 

AA278246 AW292815 AA278703 

AW247696BE265140AW403615AL037647AA312336 

AL041844 AL040002 AL039950 

AA323414 AW664013 A1B09377AJ276041 AW296883 AI798340 
AKD02161 AA327102 AI056868 AI743901 AI139018 AI199114 AJ076003 
AL042005 AL042006 AA91 1481 
AA347666 AA346521 AI111169 

AA378739 AW964174 AA570564 A1076833 AW265063 AW006805 AA480656 AW004789 

W60827AL079968ALO47234 

AA464018AA464079AA468142 

AA464510 AA631257 AI740516 AI739132 AW972467 AI741376 AW068935 AI467852 AI752240 AI123717 A1754551 

AW205510AW044211 AW028889AW1 98033 AI538632 AA513096 

AW500954 AW501111 AW501394 

AW502122AW5D2125AW501663AW501720 

AW502257 A1014241 AA 100360 BE298534 

AW299534 AW299896 AA504765 AA505099 AA505100 AA584753 AW136415 AA768306 

BE397649 H14413 SE397669 BE514098 H53372 AA448021 R57944 AI307272 BE259369 H72331 BE251092 T27364 

AA001666 AA044433 AA875998 AW075405 AW338356 AA001 667 AW300173 AW514944 AW468914 AA604673 AA702749 

AA805550 AA447621 AA934104 AJ373527 AA604794 A191 1203 A1500644 A1291 383 AA731 1 33 BE350633 AA044604 H95689 

H14366 AV660983 AA912893 AI369587 AI382271 AA917508 AW138391 BE622560 

AI376331 AI819150 AI097038 AJ351100 AW5046S9 

AW503713 AA352950 AA044972 BE616246 AA335047 AW962269 

AA557952 AA677593 AA618150 

AI739168 AA426249 AJ199636 AW505198 AW977291 AA824583 AA883419 AA724079 AKJ15524 AI377728 AW293682 

AI928140 AA731438 A1092404 AKJ85630 AA731340 

T85872T48305 

AA640770 AI6831 12 AA913009 

AA6Q2539 D59262 AJ684171 N46711 AW021657 D19768 

AA613792 AW182329 T05304 AW858385 

AK001379 AK001411 AW795711 T06997 AA287540 AA354538 AW957773 AI632268 AI651Q03 AI689650 AJ809332 

AW304483 AJ805269 AA278505 AA862381 AA287875 AW628545 AI085761 AW025965 AI658615 AW628879 AW1 39496 

AJ214278 AA902745 AA991679 BE5401 02 AW593658 AI745602 AA744687 A1285441 AA807089 A1218314 AA721449 

AI202987 AA432129 AI285502 AI281462 AA731 31 9 BE082573 

H09693 H09699 T09229 

T19142 AI351168 T52843 BE241963 
TABLE 1B

Table 1B shows the genomic positioning for those primekeys in Table 1 that lack UniGene
IDs and accession numbers. For each predicted exon, the genomic sequence source used for
prediction is listed. Nucleotide locations of each predicted exon are also listed.

Pkey: Unique number corresponding to an Eos probeset
Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled
"The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.
Strand: Indicates DNA strand from which exons were predicted.
Nt_position: Indicates nucleotide positions of predicted exons.
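
A short worked example may help when reading the nucleotide position column. The Python sketch below computes the length of a predicted exon from a "start-end" entry, assuming (this is not stated in the source) that both coordinates are inclusive. The function name `exon_length` is hypothetical.

```python
def exon_length(nt_position: str) -> int:
    """Return the exon length for a 'start-end' entry.

    Assumption: both endpoints are inclusive, so the length is
    end - start + 1.
    """
    start_text, end_text = nt_position.split("-")
    start, end = int(start_text), int(end_text)
    if end < start:
        raise ValueError("end coordinate precedes start coordinate")
    return end - start + 1


# First two positions listed in Table 1B.
first = exon_length("73381-73768")       # 388 nucleotides
second = exon_length("1934283-1934366")  # 84 nucleotides
```

If the coordinates were instead half-open, each length would be one nucleotide shorter, so the convention should be confirmed against the sequence source before use.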



Pkey    Ref                Strand  Nt_position
332792  Dunham, I. et al.  Plus    73381-73768
332908  Dunham, I. et al.  Plus    1934283-1934366
332909  Dunham, I. et al.  Plus    1946582-1946735
332913  Dunham, I. et al.  Plus    1963539-1963843
332952  Dunham, I. et al.  Plus    2472864-2473012
332958  Dunham, I. et al.  Plus    2516164-2516310
332961  Dunham, I. et al.  Plus    2521424-2521555
332975  Dunham, I. et al.  Plus    2599641-2599702
332991  Dunham, I. et al.  Plus    2686938-2687372
333119  Dunham, I. et al.  Plus    3288316-3288640
333131  Dunham, I. et al.  Plus    3350064-3350170
333139  Dunham, I. et al.  Plus    3369495-3369571
333156  Dunham, I. et al.  Plus    3617584-3617790
333222  Dunham, I. et al.  Plus    3979706-3979803
333254  Dunham, I. et al.  Plus    2521424-2521555
333348  Dunham, I. et al.  Plus    4711908-4712181
333349  Dunham, I. et al.  Plus    4713940-4714084
333366  Dunham, I. et al.  Plus    4798273-4798469
333384  Dunham, I. et al.  Plus    4907535-4907610
333385  Dunham, I. et al.  Plus    4907928-4908032
333391  Dunham, I. et al.  Plus    4916697-4916780
333488  Dunham, I. et al.  Plus    5396233-5396310
333520  Dunham, I. et al.  Plus    5586133-5586296
333524  Dunham, I. et al.  Plus    5612620-5612780
333532  Dunham, I. et al.  Plus    5622804-5622937
333580  Dunham, I. et al.  Plus    6142935-6143145
333585  Dunham, I. et al.  Plus    6234778-6234894
333597  Dunham, I. et al.  Plus    6331421-6331536
333619  Dunham, I. et al.  Plus    6562799-6562926
333671  Dunham, I. et al.  Plus    7038849-7039193
333680  Dunham, I. et al.  Plus    7071730-7071794
333682  Dunham, I. et al.  Plus    7076641-7076760
333763  Dunham, I. et al.  Plus    7692491-7692630
333764  Dunham, I. et al.  Plus    7693573-7693716
333769  Dunham, I. et al.  Plus    7696625-7696707
333770  Dunham, I. et al.  Plus    7700384-7700476
333849  Dunham, I. et al.  Plus    8018323-8018472
333875  Dunham, I. et al.  Plus    8135505-8136179
333882  Dunham, I. et al.  Plus    8153002-8153169
333922  Dunham, I. et al.  Plus    8381385-8381444
333928  Dunham, I. et al.  Plus    8468844-8469015
333947  Dunham, I. et al.  Plus    8579888-8579966
333949  Dunham, I. et al.  Plus    8589634-8589791
333968  Dunham, I. et al.  Plus    8681004-8681241
333983  Dunham, I. et al.  Plus    8813593-8813668
333995  Dunham, I. et al.  Plus    8855296-8855424
333997  Dunham, I. et al.  Plus    8866668-8867255
334003  Dunham, I. et al.  Plus    8892882-8892970
334012  Dunham, I. et al.  Plus    9007456-9010221
334047  Dunham, I. et al.  Plus    9428152-94282U
334063  Dunham, I. et al.  Plus    9731991-9732085
334066  Dunham, I. et al.  Plus    9739568-9739680
334078  Dunham, I. et al.  Plus    9809783-9809863
334118  Dunham, I. et al.  Plus    10344273-10344384
334122  Dunham, I. et al.  Plus
334150  Dunham, I. et al.  Plus
334220  Dunham, I. et al.  Plus
334298  Dunham, I. et al.  Plus
334324  Dunham, I. et al.  Plus
334335  Dunham, I. et al.  Plus
334433  Dunham, I. et al.  Plus
334532  Dunham, I. et al.  Plus
334561  Dunham, I. et al.  Plus
334616  Dunham, I. et al.  Plus
334628  Dunham, I. et al.  Plus
334630  Dunham, I. et al.  Plus
334631  Dunham, I. et al.  Plus
334661  Dunham, I. et al.  Plus
334677  Dunham, I. et al.  Plus
334696  Dunham, I. et al.  Plus
334714  Dunham, I. et al.  Plus
334718  Dunham, I. et al.  Plus
334720  Dunham, I. et al.  Plus
334727  Dunham, I. et al.  Plus
334739  Dunham, I. et al.  Plus
334740  Dunham, I. et al.  Plus
334769  Dunham, I. et al.  Plus
334872  Dunham, I. et al.  Plus
334876  Dunham, I. et al.  Plus
334883  Dunham, I. et al.  Plus
334891  Dunham, I. et al.  Plus
334900  Dunham, I. et al.  Plus
330182 5123954
327742 5867944
327805 5867968
327809 5867968
327814 5867968
327815 5867968
327791 5867977
327745 €531959
330211 6013592
330207 6013606
330257 6671881
330262 6671884
330286 6671913
328105 5868020
328113 5868024
328142 5868050
328152 5868060
328170 5868071
327910 5868162
327919 5868165
327990 5868218
328249 6381891
328251 6381891
328253 6381894
328084 6469819
328274 5868219
328615 5868239
328632 5868247
328779 5868309
328783 5868309
328801 5868321
328820 5868330
TABLE 2 DNA AND PROTEIN SEQUENCES FOR CBF9 AND BFQ8

Table 2 provides the nucleic acid and protein sequence of the CBF9 and BF08 genes as well 
as the Unigene and Exemplar accession numbers for CBF9 and BF08. 



CBF9 DNA SEQUENCE

Gene name: ESTs
Unigene number: Hs.157601
Probeset Accession #: W07459
Nucleic Acid Accession #: AC005383
Coding Sequence: 328-2751 (underlined sequences correspond to start and stop codons)



1          11         21         31         41         51
|          |          |          |          |          |
GACAGTGTTC GCGGCTGCAC CGCTCGGAGG CTGGGTGACC CGCGTAGAAG TGAAGTACTT 60
TTTTATTTGC AGACCTGGGC CGATGCCGCT TTAAAAAACG CGAGGGGCTC TATGCACCTC 120
CCTGGCGGTA GTTCCTCCGA CCTCAGCCGG GTCGGGTCGT GCCGCCCTCT CCCAGGAGAG 180
ACAAACAGGT GTCCCACGTG GCAGCCGCGC CCCGGGCGCC CCTCCTGTGA TCCCGTAGCG 240
CCCCCTGGCC CGAGCCGCGC CCGGGTCTGT GAGTAGAGCC GCCCGGGCAC CGAGCGCTGG 300
TCGCCGCTCT CCTTCCGTTA TATCAACATG CCCCCTTTCC TGTTGCTGGA GGCCGTCTGT 360
GTTTTCCTGT TTTCCAGAGT GCCCCCATCT CTCCCTCTCC AGGAAGTCCA TGTAAGCAAA 420
GAAACCATCG GGAAGATTTC AGCTGCCAGC AAAATGATGT GGTGCTCGGC TGCAGTGGAC 480
ATCATGTTTC TGTTAGATGG GTCTAACAGC GTCGGGAAAG GGAGCTTTGA AAGGTCCAAG 540
CACTTTGCCA TCACAGTCTG TGACGGTCTG GACATCAGCC CCGAGAGGGT CAGAGTGGGA 600
GCATTCCAGT TCAGTTCCAC TCCTCATCTG GAATTCCCCT TGGATTCATT TTCAACCCAA 660
CAGGAAGTGA AGGCAAGAAT CAAGAGGATG GTTTTCAAAG GAGGGCGCAC GGAGACGGAA 720
CTTGCTCTGA AATACCTTCT GCACAGAGGG TTGCCTGGAG GCAGAAATGC TTCTGTGCCC 780
CAGATCCTCA TCATCGTCAC TGATGGGAAG TCCCAGGGGG ATGTGGCACT GCCATCCAAG 840
CAGCTGAAGG AAAGGGGTGT CACTGTGTTT GCTGTGGGGG TCAGGTTTCC CAGGTGGGAG 900
GAGCTGCATG CACTGGCCAG CGAGCCTAGA GGGCAGCACG TGCTGTTGGC TGAGCAGGTG 960
GAGGATGCCA CCAACGGCCT CTTCAGCACC CTCAGCAGCT CGGCCATCTG CTCCAGCGCC 1020
ACGCCAGACT GCAGGGTCGA GGCTCACCCC TGTGAGCACA GGACGCTGGA GATGGTCCGG 1080
GAGTTCGCTG GCAATGCCCC ATGCTGGAGA GGATCGCGGC GGACCCTTGC GGTGCTGGCT 1140
GCACACTGTC CCTTCTACAG CTGGAAGAGA GTGTTCCTAA CCCACCCTGC CACCTGCTAC 1200
AGGACCACCT GCCCAGGCCC CTGTGACTCG CAGCCCTGCC AGAATGGAGG CACATGTGTT 1260
CCAGAAGGAC TGGACGGCTA CCAGTGCCTC TGCCCGCTGG CCTTTGGAGG GGAGGCTAAC 1320
TGTGCCCTGA AGCTGAGCCT GGAATGCAGG GTCGACCTCC TCTTCCTGCT GGACAGCTCT 1380
GCGGGCACCA CTCTGGACGG CTTCCTGCGG GCCAAAGTCT TCGTGAAGCG GTTTGTGCGG 1440
GCCGTGCTGA GCGAGGACTC TCGGGCCCGA GTGGGTGTGG CCACATACAG CAGGGAGCTG 1500
CTGGTGGCGG TGCCTGTGGG GGAGTACCAG GATGTGCCTG ACCTGGTCTG GAGCCTCGAT 1560
GGCATTCCCT TCCGTGGTGG CCCCACCCTG ACGGGCAGTG CCTTGCGGCA GGCGGCAGAG 1620
CGTGGCTTCG GGAGCGCCAC CAGGACAGGC CAGGACCGGC CACGTAGAGT GGTGGTTTTG 1680
CTCACTGAGT CACACTCCGA GGATGAGGTT GCGGGCCCAG CGCGTCACGC AAGGGCGCGA 1740
GAGCTGCTCC TGCTGGGTGT AGGCAGTGAG GCCGTGCGGG CAGAGCTGGA GGAGATCACA 1800
GGCAGCCCAA AGCATGTGAT GGTCTACTCG GATCCTCAGG ATCTGTTCAA CCAAATCCCT 1860
GAGCTGCAGG GGAAGCTGTG CAGCCGGCAG CGGCCAGGGT GCCGGACACA AGCCCTGGAC 1920
CTCGTCTTCA TGTTGGACAC CTCTGCCTCA GTAGGGCCCG AGAATTTTGC TCAGATGCAG 1980
AGCTTTGTGA GAAGCTGTGC CCTCCAGTTT GAGGTGAACC CTGACGTGAC ACAGGTCGGC 2040
CTGGTGGTGT ATGGCAGCCA GGTGCAGACT GCCTTCGGGC TGGACACCAA ACCCACCCGG 2100
GCTGCGATGC TGCGGGCCAT TAGCCAGGCC CCCTACCTAG GTGGGGTGGG CTCAGCCGGC 2160
ACCGCCCTGC TGCACATCTA TGACAAAGTG ATGACCGTCC AGAGGGGTGC CCGGCCTGGT 2220
GTCCCCAAAG CTGTGGTGGT GCTCACAGGC GGGAGAGGCG CAGAGGATGC AGCCGTTCCT 2280
GCCCAGAAGC TGAGGAACAA TGGCATCTCT GTCTTGGTCG TGGGCGTGGG GCCTGTCCTA 2340
AGTGAGGGTC TGCGGAGGCT TGCAGGTCCC CGGGATTCCC TGATCCACGT GGCAGCTTAC 2400
GCCGACCTGC GGTACCACCA GGACGTGCTC ATTGAGTGGC TGTGTGGAGA AGCCAAGCAG 2460
CCAGTCAACC TCTGCAAACC CAGCCCGTGC ATGAATGAGG GCAGCTGCGT CCTGCAGAAT 2520
GGGAGCTACC GCTGCAAGTG TCGGGATGGC TGGGAGGGCC CCCACTGCGA GAACCGTGAG 2580
TGGAGCTCTT GCTCTGTATG TGTGAGCCAG GGATGGATTC TTGAGACGCC CCTGAGGCAC 2640
ATGGCTCCCG TGCAGGAGGG CAGCAGCCGT ACCCCTCCCA GCAACTACAG AGAAGGCCTG 2700
GGCACTGAAA TGGTGCCTAC CTTCTGGAAT GTCTGTGCCC CAGGTCCETA GAATGTCTGC 2760
TTCCCGCCGT GGCCAGGACC ACTATTCTCA CTGAGGGAGG AGGATGTCCC AACTGCAGCC 2820
ATGCTGCTTA GAGACAAGAA AGCAGCTGAT GTCACCCACA AACGATGTTG TTGAAAAGTT 2880
TTGATGTGTA AGTAAATACC CACTTTCTGT ACCTGCTGTG CCTTGTTGAG GCTATGTCAT 2940
CTGCCACCTT TCCCTTGAGG ATAAACAAGG GGTCCTGAAG ACTTAAATTT AGCGGCCTGA 3000
CGTTCCTTTG CACACAATCA ATGCTCGCCA GAATGTTGTT GACACAGTAA TGCCCAGCAG 3060
AGGCCTTTAC TAGAGCATCC TTTGGACGGC GAAGGCCACG GCCTTTCAAG ATGGAAAGCA 3120
GCAGCTTTTC CACTTCCCCA GAGACATTCT GGATGCATTT GCATTGAGTC TGAAAGGGGG 3180
CTTGAGGGAC GTTTGTGACT TCTTGGCGAC TGCCTTTTGT GTGTGGAAGA GACTTGGAAA 3240
GGTCTCAGAC TGAATGTGAC CAATTAACCA GCTTGGTTGA TGATGGGGGA GGGGCTGAGT 3300
TGTGCATGGG CCCAGGTCTG GAGGGCCACG TAAAATCGTT CTGAGTCGTG AGCAGTGTCC 3360
ACCTTGAAGG TCTTC



CBF9 Protein sequence

Gene name: ESTs
Unigene number: Hs.157601
Protein Accession #: none found
Signal sequence: 1-17
Transmembrane domains: none found
VGW domains: 49-223; 341-518; 529-706
EGF domains: 298-333; 715-748
Cellular Localization: plasma membrane



1          11         21         31         41         51
|          |          |          |          |          |
MPPFLLLEAV CVFLFSRVPP SLPLQEVHVS KETIGKISAA SKMMWCSAAV DIMFLLDGSN 60
SVGKGSFERS KHFAITVCDG LDISPERVRV GAFQFSSTPH LEFPLDSFST QQEVKARIKR 120
MVFKGGRTET ELALKYLLHR GLPGGRNASV PQILIIVTDG KSQGDVALPS KQLKERGVTV 180
FAVGVRFPRW EELHALASEP RGQHVLLAEQ VEDATNGLFS TLSSSAICSS ATPDCRVEAH 240
PCEHRTLEMV REFAGNAPCW RGSRRTLAVL AAHCPFYSWK RVFLTHPATC YRTTCPGPCD 300
SQPCQNGGTC VPEGLDGYQC LCPLAFGGEA NCALKLSLEC RVDLLFLLDS SAGTTLDGFL 360
RAKVFVKRFV RAVLSEDSRA RVGVATYSRE LLVAVPVGEY QDVPDLVWSL DGIPFRGGPT 420
LTGSALRQAA ERGFGSATRT GQDRPRRVVV LLTESHSEDE VAGPARHARA RELLLLGVGS 480
EAVRAELEEI TGSPKHVMVY SDPQDLFNQI PELQGKLCSR QRPGCRTQAL DLVFMLDTSA 540
SVGPENFAQM QSFVRSCALQ FEVNPDVTQV GLVVYGSQVQ TAFGLDTKPT RAAMLRAISQ 600
APYLGGVGSA GTALLHIYDK VMTVQRGARP GVPKAVVVLT GGRGAEDAAV PAQKLRNNGI 660
SVLVVGVGPV LSEGLRRLAG PRDSLIHVAA YADLRYHQDV LIEWLCGEAK QPVNLCKPSP 720
CMNEGSCVLQ NGSYRCKCRD GWEGPHCENR EWSSCSVCVS QGWILETPLR HMAPVQEGSS 780
RTPPSNYREG LGTEMVPTFW NVCAPGP
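
The relationship between the stated coding sequence coordinates and the protein sequence can be checked mechanically. The following is a minimal sketch, not part of the original disclosure, assuming the standard genetic code and 1-based inclusive coordinates; the function name translate_cds and the 40-nucleotide test fragment (positions 321-360 of the CBF9 DNA sequence) are chosen here for illustration only.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate_cds(seq: str, start: int, end: int) -> str:
    """Translate seq[start..end] (1-based, inclusive); a trailing stop is dropped."""
    cds = seq[start - 1:end].upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding region length is not a multiple of three")
    protein = "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))
    return protein.rstrip("*")


# Hypothetical illustration: nucleotides 321-360 of the CBF9 DNA sequence.
# The ATG at position 328 of the full sequence is position 8 of this fragment.
fragment = "TATCAACATGCCCCCTTTCCTGTTGCTGGAGGCCGTCTGT"
print(translate_cds(fragment, 8, 40))
```

Under these assumptions the fragment translates to the first eleven residues of the CBF9 protein sequence. The stated coordinates 328-2751 span 2424 nucleotides, that is, 807 codons plus a stop codon, which agrees with the 807 residues listed.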



BFQ8 DNA SEQUENCE

Gene name: TMPRSS3a
Unigene number: Hs.298241
Probeset Accession #: AI538613
Nucleic Acid Accession #: AB038157
Coding sequence: 202-1566 (underlined sequences correspond to start and stop codons)



1          11         21         31         41         51
|          |          |          |          |          |

ACCGGGCACC GGACGGCTCG GGTACTTTCG TTCTTAATTA GGTCATGCCC GTGTGAGCCA 60 

GGAAAGGGCT GTGTTTATGG GAAGCCAGTA ACACTGTGGC CTACTATCTC TTCCGTGGTG 120 

CCATCTACAT TTTTGGGACT CGGGAATTAT GAGGTAGAGG TGGAGGCGGA GCCGGATGTC 180 

AGAGGTCCTG AAATAGTCAC CATGGGGGAA AATGATCCGC CTGCTGTTGA AGCCCCCTTC 240 

TCATTCCGAT CGCTTTTTGG CCTTGATGAT TTGAAAATAA GTCCTGTTGC ACCAGATGCA 300 

GATGCTGTTG CTGCACAGAT CCTGTCACTG CTGCCATTGA AGTTTTTTCC AATCATCGTC 360 

ATTGGGATCA TTGCATTGAT ATTAGCACTG GCCATTGGTC TGGGCATCCA CTTCGACTGC 420 

TCAGGGAAGT ACAGATGTCG CTCATCCTTT AAGTGTATCG AGCTGATAGC TCGATGTGAC 480 

GGAGTCTCGG ATTGCAAAGA CGGGGAGGAC GAGTACCGCT GTGTCCGGGT GGGTGGTCAG 540 

AATGCCGTGC TCCAGGTGTT CACAGCTGCT TCGTGGAAGA CCATGTGCTC CGATGACTGG 600 

AAGGGTCACT ACGCAAATGT TGCCTGTGCC CAACTGGGTT TCCCAAGCTA TGTGAGTTCA 660 

GATAACCTCA GAGTGAGCTC GCTGGAGGGG CAGTTCCGGG AGGAGTTTGT GTCCATCGAT 720

CACCTCTTGC CAGATGACAA GGTGACTGCA TTACACCACT CAGTATATGT GAGGGAGGGA 780 
TGTGCCTCTG GCCACGTGGT TACCTTGCAG TGCACAGCCT GTGGTCATAG AAGGGGCTAC 840

AGCTCACGCA TCGTGGGTGG AAACATGTCC TTGCTCTCGC AGTGGCCCTG GCAGGCCAGC 900 

CTTCAGTTCC AGGGCTACCA CCTGTGCGGG GGCTCTGTCA TCACGCCCCT GTGGATCATC 960 

ACTGCTGCAC ACTGTGTTTA TGACTTGTAC CTCCCCAAGT CATGGACCAT CCAGGTGGGT 1020 

CTAGTTTCCC TGTTGGACAA TCCAGCCCCA TCCCACTTGG TGGAGAAGAT TGTCTACCAC 1080 

AGCAAGTACA AGCCAAAGAG GCTGGGCAAT GACATCGCCC TTATGAAGCT GGCCGGGCCA 1140 

CTCACGTTCA ATGAAATGAT CCAGCCTGTG TGCCTGCCCA ACTCTGAAGA GAACTTCCCC 1200 

GATGGAAAAG TGTGCTGGAC GTCAGGATGG GGGGCCACAG AGGATGGAGC AGGTGACGCC 1260 

TCCCCTGTCC TGAACCACGC GGCCGTCCCT TTGATTTCCA ACAAGATCTG CAACCACAGG 1320 

GACGTGTACG GTGGCATCAT CTCCCCCTCC ATGCTCTGCG CGGGCTACCT GACGGGTGGC 1380 

GTGGACAGCT GCCAGGGGGA CAGCGGGGGG CCCCTGGTGT GTCAAGAGAG GAGGCTGTGG 1440 

AAGTTAGTGG GAGCGACCAG CTTTGGCATC GGCTGCGCAG AGGTGAACAA GCCTGGGGTG 1500 

TACACCCGTG TCACCTCCTT CCTGGACTGG ATCCACGAGC AGATGGAGAG AGACCTAAAA 1560 

ACCTGAAGAG GAAGGGGACA AGTAGCCACC TGAGTTCCTG AGGTGATGAA GACAGCCCGA 1620 

TCCTCCCCTG GACTCCCGTG TAGGAACCTG CACACGAGCA GACACCCTTG GAGCTCTGAG 1680 

TTCCGGCACC AGTAGCAGGC CCGAAAGAGG CACCCTTCCA TCTGATTCCA GCACAACCTT 1740 

CAAGCTGCTT TTTGTTTTTT GTTTTTTTGA GGTGGAGTCT CGCTCTGTTG CCCAGGCTGG 1800 

AGTGCAGTGG CGAAATCCCT GCTCACTGCA GCCTCCGCTT CCCTGGTTCA AGCGATTCTC 1860 

TTGCCTCAGC TTCCCCAGTA GCTGGGACCA CAGGTGCCCG CCACCACACC CAACTAATTT 1920 

TTGTATTTTT AGTAGAGACA GGGTTTCACC ATGTTGGCCA GGCTGCTCTC AAACCCCTGA 1980 

CCTCAAATGA TGTGCCTGCT TCAGCCTCCC ACAGTGCTGG GATTACAGGC ATGGGCCACC 2040 

ACGCCTAGCC TCACGCTCCT TTCTGATCTT CACTAAGAAC AAAAGAAGCA GCAACTTGCA 2100 

AGGGCGGCCT TTCCCACTGG TCCATCTGGT TTTCTCTCCA GGGGTCTTGC AAAATTCCTG 2160 

ACGAGATAAG CAGTTATGTG ACCTCACGTG CAAAGCCACC AACAGCCACT CAGAAAAGAC 2220 

GCACCAGCCC AGAAGTGCAG AACTGCAGTC ACTGCACGTT TTCATCTCTA GGGACCAGAA 2280 

CCAAACCCAC CCTTTCTACT TCCAAGACTT ATTTTCACAT GTGGGGAGGT TAATCTAGGA 2340 

ATGACTCGTT TAAGGCCTAT TTTCATGATT TCTTTGTAGC ATTTGGTGCT TGACGTATTA 2400 

TTGTCCTTTG ATTCCAAATA ATATGTTTCC TTCCCTCAAA AAAAAAAAAA AAAAAAAAAA 2460 
AAAAAAAA 



BFQ8 Protein sequence:

Gene name: TMPRSS3a
Unigene number: Hs.298241
Probeset Accession #: AI538613
Protein Accession #: BAB20077
Signal sequence: none found
Transmembrane domains: 43-65, 239-261
Tryp_SPc domain: 216-444
Cellular Localization: not determined



1          11         21         31         41         51
|          |          |          |          |          |
MGENDPPAVE APFSFRSLFG LDDLKISPVA PDADAVAAQI LSLLPLKFFP IIVIGIIALI 60
LALAIGLGIH FDCSGKYRCR SSFKCIELIA RCDGVSDCKD GEDEYRCVRV GGQNAVLQVF 120
TAASWKTMCS DDWKGHYANV ACAQLGFPSY VSSDNLRVSS LEGQFREEFV SIDHLLPDDK 180
VTALHHSVYV REGCASGHVV TLQCTACGHR RGYSSRIVGG NMSLLSQWPW QASLQFQGYH 240
LCGGSVITPL WIITAAHCVY DLYLPKSWTI QVGLVSLLDN PAPSHLVEKI VYHSKYKPKR 300
LGNDIALMKL AGPLTFNEMI QPVCLPNSEE NFPDGKVCWT SGWGATEDGA GDASPVLNHA 360
AVPLISNKIC NHRDVYGGII SPSMLCAGYL TGGVDSCQGD SGGPLVCQER RLWKLVGATS 420
FGIGCAEVNK PGVYTRVTSF LDWIHEQMER DLKT
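
The domain annotations in Table 2 are given as residue ranges such as "43-65, 239-261". As an illustrative sketch only, assuming these ranges are 1-based and inclusive, the annotated segments can be extracted as shown below; parse_ranges and slice_domain are hypothetical helper names, and BFQ8_N_TERM holds only the first 70 residues of the BFQ8 protein sequence.

```python
def parse_ranges(text: str) -> list[tuple[int, int]]:
    """Parse '43-65, 239-261' or '49-223; 341-518' into (start, end) tuples."""
    ranges = []
    for part in text.replace(";", ",").split(","):
        part = part.strip()
        if not part:
            continue
        start, end = (int(x) for x in part.split("-"))
        if start < 1 or end < start:
            raise ValueError(f"invalid range: {part}")
        ranges.append((start, end))
    return ranges


def slice_domain(protein: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of a protein sequence."""
    if end > len(protein):
        raise ValueError("range extends past the end of the sequence")
    return protein[start - 1:end]


# First 70 residues of the BFQ8 protein sequence (spaces removed).
BFQ8_N_TERM = (
    "MGENDPPAVE APFSFRSLFG LDDLKISPVA PDADAVAAQI "
    "LSLLPLKFFP IIVIGIIALI LALAIGLGIH"
).replace(" ", "")

first_tm = parse_ranges("43-65, 239-261")[0]
print(slice_domain(BFQ8_N_TERM, *first_tm))
```

Under these assumptions the first annotated transmembrane range yields a 23-residue, predominantly hydrophobic segment, which is what would be expected for a membrane-spanning region.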
WHAT IS CLAIMED IS:

1. A method of screening drug candidates comprising:
a) providing a cell that expresses an expression profile gene selected from the group
consisting of an expression profile gene set forth in Table 1 or Table 2 or fragment thereof;
b) adding a drug candidate to said cell; and
c) determining the effect of said drug candidate on the expression of said expression
profile gene.

2. A method according to claim 1 wherein said determining comprises comparing the level
of expression in the absence of said drug candidate to the level of expression in the
presence of said drug candidate.

3. A method of screening for a bioactive agent capable of binding to a colorectal cancer
modulator protein (colorectal cancer modulator protein), wherein said colorectal cancer
modulator protein is encoded by a nucleic acid selected from the group consisting of a
nucleic acid of Table 1 or Table 2 or a fragment thereof, said method comprising:
a) combining said colorectal cancer modulator protein and a candidate bioactive agent; and
b) determining the binding of said candidate agent to said colorectal cancer modulator
protein.

4. A method for screening for a bioactive agent capable of modulating the activity of a
colorectal cancer modulator protein, wherein said colorectal cancer modulator protein is
encoded by a nucleic acid selected from the group consisting of a nucleic acid of Table 1
or Table 2 or a fragment thereof, said method comprising:
a) combining said colorectal cancer modulator protein and a candidate bioactive agent; and
b) determining the effect of said candidate agent on the bioactivity of said colorectal
cancer modulator protein.

5. A method of evaluating the effect of a candidate colorectal cancer drug comprising:
a) administering said drug to a patient;
b) removing a cell sample from said patient; and
c) determining the expression of a gene selected from the group consisting of a nucleic
acid of Table 1 or Table 2.

6. A method according to claim 5 further comprising comparing said expression profile to
an expression profile of a healthy individual.

7. A method of diagnosing colorectal cancer comprising:
a) determining the expression of one or more genes selected from the group consisting of
a nucleic acid of Table 1 or Table 2 or a fragment thereof or a polypeptide encoded thereby
in a first tissue type of a first individual; and
b) comparing said expression of said gene(s) from a second normal tissue type from said
first individual or a second unaffected individual;
wherein a difference in said expression indicates that the first individual has colorectal
cancer.

8. A method for screening for a bioactive agent capable of interfering with the binding of
a colorectal cancer modulator protein (colorectal cancer modulator protein) or a fragment
thereof and an antibody which binds to said colorectal cancer modulator protein or fragment
thereof, said method comprising:
a) combining a colorectal cancer modulator protein or fragment thereof, a candidate
bioactive agent and an antibody which binds to said colorectal cancer modulator protein or
fragment thereof; and
b) determining the binding of said colorectal cancer modulator protein or fragment thereof
and said antibody.
9. A method for inhibiting the activity of a colorectal cancer modulator protein
(colorectal cancer modulator protein), wherein said colorectal cancer modulator protein is
encoded by a nucleic acid selected from the group consisting of a nucleic acid of Table 1
or Table 2 or a fragment thereof, said method comprising binding an inhibitor to said
colorectal cancer modulator protein.

10. A method according to claim 9 wherein said inhibitor is an antibody.

11. A method of treating colorectal cancer comprising administering to a patient an
inhibitor of a colorectal cancer modulator protein, wherein said colorectal cancer
modulator protein is encoded by a nucleic acid selected from the group consisting of a
nucleic acid of Table 1 or Table 2 or a fragment thereof.

12. A method according to claim 11 wherein said inhibitor is an antibody.

13. A method of neutralizing the effect of a colorectal cancer modulator protein, or a
fragment thereof, comprising contacting an agent specific for said protein with said
protein in an amount sufficient to effect neutralization.

14. A method for localizing a therapeutic moiety to colorectal cancer tissue comprising
exposing said tissue to an antibody to a colorectal cancer modulator protein or fragment
thereof conjugated to said therapeutic moiety.

15. The method of Claim 14, wherein said therapeutic moiety is a cytotoxic agent.

16. The method of Claim 14, wherein said therapeutic moiety is a radioisotope.

17. A method for inhibiting colorectal cancer in a cell, wherein said method comprises
administering to a cell a composition comprising antisense molecules to a nucleic acid of
Table 1 or Table 2.

18. An antibody which specifically binds to a protein encoded by a nucleic acid of Table 1
or Table 2 or a fragment thereof.
19. The antibody of Claim 18, wherein said antibody is a monoclonal antibody.

20. The antibody of Claim 18, wherein said antibody is a humanized antibody.

21. The antibody of Claim 18, wherein said antibody is an antibody fragment.

22. A biochip comprising one or more nucleic acid segments selected from the group
consisting of a nucleic acid of Table 1 or Table 2 or a fragment thereof, wherein said
biochip comprises fewer than 1000 nucleic acid probes.

23. A nucleic acid having a sequence at least 95% homologous to a sequence of a nucleic
acid of Table 1 or Table 2 or its complement.

24. A nucleic acid which hybridizes under high stringency to a nucleic acid of Table 1 or
Table 2 or its complement.

25. A polypeptide encoded by the nucleic acid of Claim 23 or 24.

26. A method of eliciting an immune response in an individual, said method comprising
administering to said individual a composition comprising the polypeptide of Claim 25 or a
fragment thereof.

27. A method of eliciting an immune response in an individual, said method comprising
administering to said individual a composition comprising a nucleic acid comprising a
sequence of a nucleic acid of Table 1 or Table 2 or a fragment thereof.

28. A method of determining the prognosis of an individual with colorectal cancer
comprising:
a) determining the expression of one or more genes selected from the group consisting of
a nucleic acid of Table 1 or Table 2 or a fragment thereof in a first tissue type of a
first individual; and
b) comparing said expression of said gene(s) from a second normal tissue type from said
first individual or a second unaffected individual;
wherein a substantial difference in said expression indicates a poor prognosis.

29. A method of treating colorectal cancer comprising administering to an individual
having colorectal cancer an antibody to a colorectal cancer modulator protein or fragment
thereof conjugated to a therapeutic moiety.

30. The method of Claim 29, wherein said therapeutic moiety is a cytotoxic agent.

31. The method of Claim 29, wherein said therapeutic moiety is a radioisotope.
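
For illustration of the percentage threshold recited in claim 23, the following minimal sketch computes a simple percent identity between two sequences. It assumes, as a simplification that is not drawn from the claims, that "homologous" is read as position-by-position identity over an ungapped alignment of equal-length sequences; the meaning of the term and the alignment method are governed by the specification, and the function names and example sequences here are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length (ungapped comparison)")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def meets_threshold(a: str, b: str, threshold: float = 95.0) -> bool:
    """True if the percent identity is at least the given threshold."""
    return percent_identity(a, b) >= threshold


# Hypothetical example: the first 20 nucleotides of the BFQ8 DNA sequence
# compared with a variant carrying a single substitution at the last position.
reference = "ACCGGGCACCGGACGGCTCG"
variant = "ACCGGGCACCGGACGGCTCA"
print(percent_identity(reference, variant), meets_threshold(reference, variant))
```

Sequences of unequal length, or sequences containing insertions or deletions, would first need to be aligned with a dedicated alignment tool, which this sketch does not attempt.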